ENCODED COMBINATORIAL CHEMICAL LIBRARIES
Description 



Technical Field 

The present invention relates to encoded chemical 
libraries that contain repertoires of chemical 
structures defining a diversity of biological 
structures, and methods for using the libraries. 

Background 

There is an increasing need to find new molecules 
which can effectively modulate a wide range of 
biological processes, for applications in medicine and 
agriculture. A standard way for searching for novel 
bioactive chemicals is to screen collections of 
natural materials, such as fermentation broths or 
plant extracts, or libraries of synthesized molecules 
using assays which can range in complexity from simple 
binding reactions to elaborate physiological 
preparations. The screens often only provide leads 
which then require further improvement either by 
empirical methods or by chemical design. The process
is time-consuming and costly but it is unlikely to be
totally replaced by rational methods even when they 
are based on detailed knowledge of the chemical 
structure of the target molecules. Thus, what we 
might call "irrational drug design" - the process of 
selecting the right molecules from large ensembles or 
repertoires - requires continual improvement both in 
the generation of repertoires and in the methods of 
selection. 

Recently there have been several developments in
using peptides or nucleotides to provide libraries of
compounds for lead discovery. The methods were
originally developed to speed up the determination of
epitopes recognized by monoclonal antibodies. For
example, the standard serial process of stepwise
search of synthetic peptides now encompasses a variety
of highly sophisticated methods in which large arrays
of peptides are synthesized in parallel and screened
with acceptor molecules labelled with fluorescent or
other reporter groups. The sequence of any effective
peptide can be decoded from its address in the array.
See, for example, Geysen et al., Proc. Natl. Acad. Sci.
USA, 81:3998-4002 (1984); Maeji et al., J. Immunol.
Methods, 146:83-90 (1992); and Fodor et al., Science,
251:767-775 (1991).

In another approach, Lam et al., Nature, 354:82-84
(1991) describes combinatorial libraries of
peptides that are synthesized on resin beads such that
each resin bead contains about 20 pmoles of the same
peptide. The beads are screened with labelled
acceptor molecules and those with bound acceptor are
searched for by visual inspection, physically removed,
and the peptide identified by direct sequence
analysis. In principle, this method could be used
with other chemical entities but it requires sensitive
methods for sequence determination.

A different method of solving the problem of
identification in a combinatorial peptide library is
used by Houghten et al., Nature, 354:84-86 (1991).
For hexapeptides of the 20 natural amino acids, 400
separate libraries are synthesized, each with the
first two amino acids fixed and the remaining four
positions occupied by all possible combinations. An
assay, based on competition for binding or other
activity, is then used to find the library with an
active peptide. Then twenty new libraries are
synthesized and assayed to determine the effective
amino acid in the third position, and the process is
reiterated in this fashion until the active
hexapeptide is defined. This is analogous to the
method used in searching a dictionary; the peptide is
decoded by construction using a series of sieves or
buckets and this makes the search logarithmic.
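The economy of this iterative scheme can be checked with a short calculation. The following is a minimal sketch, not part of the cited method; the function name and its parameters are hypothetical and chosen only for illustration.

```python
def iterative_deconvolution_counts(alphabet_size=20, length=6, fixed_first=2):
    """Compare libraries assayed iteratively with the number of distinct peptides.

    Hypothetical helper: the first round fixes `fixed_first` positions, and
    every later round fixes exactly one further position.
    """
    # First round: one library per combination of the fixed leading positions.
    first_round = alphabet_size ** fixed_first
    # Later rounds: one library per candidate residue at each remaining position.
    later_rounds = (length - fixed_first) * alphabet_size
    # Exhaustive alternative: every peptide made and tested individually.
    distinct_peptides = alphabet_size ** length
    return first_round + later_rounds, distinct_peptides


libraries_assayed, distinct_peptides = iterative_deconvolution_counts()
print(libraries_assayed, distinct_peptides)
```

For hexapeptides of 20 amino acids this gives 400 + 4 x 20 = 480 libraries to assay, against 64,000,000 distinct hexapeptides.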

A very powerful biological method has recently
been described in which the library of peptides is
presented on the surface of a bacteriophage such that
each phage has an individual peptide and contains the
DNA sequence specifying it. The library is made by
synthesizing a repertoire of random oligonucleotides
to generate all combinations, followed by their
insertion into a phage vector. Each of the sequences
is cloned in one phage and the relevant peptide can be
selected by finding those that bind to the particular
target. The phages recovered in this way can be
amplified and the selection repeated. The sequence of
the peptide is decoded by sequencing the DNA. See, for
example, Cwirla et al., Proc. Natl. Acad. Sci. USA,
87:6378-6382 (1990); Scott et al., Science, 249:386-390
(1990); and Devlin et al., Science, 249:404-406
(1990).

Another "genetic" method has been described where
the libraries are the synthetic oligonucleotides
themselves wherein active oligonucleotide molecules
are selected by binding to an acceptor and are then
amplified by the polymerase chain reaction (PCR). PCR
allows serial enrichment and the structure of the
active molecules is then decoded by DNA sequencing on
clones generated from the PCR products. The
repertoire is limited to nucleotides and the natural
pyrimidine and purine bases or those modifications
that preserve specific Watson-Crick pairing and can be
copied by polymerases.

The main advantages of the genetic methods reside
in the capacity for cloning and amplification of DNA
sequences, which allows enrichment by serial selection
and provides a facile method for decoding the
structure of active molecules. However, the genetic
repertoires are restricted to nucleotides and peptides
composed of natural amino acids, and a more extensive
chemical repertoire is required to populate the entire
universe of binding sites. In contrast, chemical
methods can provide limitless repertoires but they
lack the capacity for serial enrichment and there are
difficulties in discovering the structures of selected
active molecules.



Brief Summary of the Invention 

The present invention provides a way of combining
the virtues of both the chemical and genetic
methods summarized above through the construction of
encoded combinatorial chemical libraries, in which
each chemical sequence is labelled by an appended
"genetic" tag, itself constructed by chemical
synthesis, to provide a "retrogenetic" way of
specifying each chemical structure.

In outline, two alternating parallel
combinatorial syntheses are performed so that the
genetic tag is chemically linked to the chemical
structure being synthesized; in each case, the
addition of one of the particular chemical units to
the structure is followed by the addition of an
oligonucleotide sequence, which is defined to "code"
for that chemical unit, i.e., to function as an
identifier for the structure of the chemical unit.
The library is built up by the repetition of this
process after pooling and division.

Active molecules are selected from the library so
produced by binding to a preselected biological
molecule of interest or by identifying a species in
the library having a desired activity, including
binding, activation, chemical catalysis and the like.
Thereafter, the identity of the active molecule is
determined by reading the genetic tag, i.e., the
identifier oligonucleotide sequence. In one
embodiment, amplified copies of the retrogenetic
tags can be obtained by the polymerase chain reaction.
The strands of the amplified copies with the
appropriate polarity can then be used to enrich for a
subset of the library by hybridization with the
matching tags and the process can then be repeated on
this subset. Thus serial enrichment is achieved by a
process of purification exploiting linkage to a
nucleotide sequence which can be amplified. Finally,
the structures of the chemical entities are decoded by
cloning and sequencing the products of the PCR.

The present invention therefore provides a novel
method for identifying a chemical structure having a
preselected binding or catalysis activity through the
use of a library of bifunctional molecules that
provides a rich source of chemical diversity. The
library is used to identify chemical structures
(structural motifs) that interact with preselected
biological molecules.

Thus, in one embodiment, the invention
contemplates a bifunctional molecule according to the
formula A-B-C, where A is a chemical moiety, B is a
linker molecule operatively linked to A and C, and C
is an identifier oligonucleotide comprising a sequence
of nucleotides that identifies the structure of
chemical moiety A.

In another embodiment, the invention contemplates
a library comprising a plurality of species of
bifunctional molecules, thereby forming a repertoire
of chemical diversity.

Another embodiment contemplates a method for
identifying a chemical structure that participates in
a preselected chemical or biochemical interaction with
a biologically active molecule, such as binding or
catalysis, where the chemical structure is present in
the library of bifunctional molecules according to
this invention. The method comprises the steps of:

a) admixing in solution the library of
bifunctional molecules with the biologically active
molecule under binding conditions for a time period
sufficient to form a binding reaction complex;

b) isolating the complex formed in step
(a); and

c) determining the nucleotide sequence of
the polymer identifier oligonucleotide in the isolated
complex and thereby identifying the chemical structure
that participated in the preselected binding
interaction.

The invention also contemplates a method for
preparing a library according to this invention
comprising the steps of:

a) providing a linker molecule B having
termini A' and C' according to the formula A'-B-C'
that is adapted for reaction with a chemical precursor
unit X' at termini A' and with a nucleotide precursor
Z' at termini C';

b) conducting syntheses by adding chemical
precursor unit X' to termini A' of said linker and
adding precursor unit identifier oligonucleotide Z' to
termini C' of said linker, to form a composition
containing bifunctional molecules having the structure
X_n-B-Z_n;

c) repeating step (b) on one or more
aliquots of the composition to produce aliquots that
contain a product containing a bifunctional molecule;

d) combining the aliquots produced in step
(c) to form an admixture of bifunctional molecules,
thereby forming said library.
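The divide, couple, tag and pool cycle of steps (a) through (d) can be modelled in a few lines. The following is a minimal sketch under stated assumptions: the two chemical units and their hexanucleotide tags are hypothetical placeholders, and the chemistry is reduced to appending symbols to two parallel tuples.

```python
def split_and_pool(units_to_tags, rounds):
    """Enumerate (polymer, identifier) pairs produced by split-and-pool synthesis.

    `units_to_tags` maps each chemical unit to its unit identifier tag.
    Position 0 of each tuple is the unit or tag nearest the linker.
    """
    # Start from the bare linker: no chemical units and no tags attached yet.
    pool = [((), ())]
    for _ in range(rounds):
        next_pool = []
        # Divide the pool into one aliquot per chemical unit.
        for unit, tag in units_to_tags.items():
            # Within an aliquot, add the unit and then its identifier tag.
            for polymer, identifier in pool:
                next_pool.append((polymer + (unit,), identifier + (tag,)))
        # Recombine all aliquots before the next cycle.
        pool = next_pool
    return pool


# Hypothetical two-unit alphabet with hypothetical hexanucleotide tags.
code = {"Gly": "CACATG", "Met": "ACGGTA"}
library = split_and_pool(code, rounds=3)
print(len(library))
```

Because each unit is always followed by its own tag, every species in the pooled library carries an identifier that spells out its own synthesis history.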

In a related embodiment, the invention describes
a bifunctional solid support for synthesizing an
oligopeptide/oligonucleotide conjugate comprising:
(1) a solid support, the solid support being of a type
which is dispersible in aqueous solution, (2) a first
linkage unit coupled to the solid support, (3) a
second linkage unit coupled to the first linkage unit,
and (4) a bifunctional unit coupled to the second
linkage unit, wherein the bifunctional unit has a
first leaving group employable for oligopeptide
synthesis and a second leaving group employable for
oligonucleotide synthesis, wherein the first leaving
group is N-FMOC or its functional equivalent, the
second leaving group is O-DMT or its functional
equivalent, wherein the second linkage unit is coupled
to the first linkage unit by means of a bond cleavable
by exposure to concentrated aqueous ammonia, and
wherein the solid support, the first linkage unit, the
second linkage unit, the cleavable bond, and the
bifunctional unit, exclusive of the first and second
leaving groups, are each substantially chemically
unreactive to conditions employed by oligopeptide
synthetic protocols using FMOC leaving groups and
conditions employed by oligonucleotide synthetic
protocols using O-DMT leaving groups.

Further contemplated is an element of a library
of oligopeptide/oligonucleotide conjugates comprising:
(1) a solid support being of a type which is
dispersible in aqueous solution, (2) a first linkage
unit coupled to the solid support, (3) a second
linkage unit coupled to the first linkage unit, (4) a
bifunctional unit coupled to the second linkage unit,
(5) an oligopeptide attached to the bifunctional unit,
and (6) an oligonucleotide attached to the
bifunctional unit.

Brief Description of the Drawings 

In the drawings, forming a portion of this 
disclosure: 

Figure 1 illustrates a scheme for the restriction
endonuclease cleavage of a PCR amplification product
derived from a bifunctional molecule of this invention
(Step 1), and the subsequent addition of biotin to the
cleaved PCR product (Step 2).

Figure 2 illustrates the process of producing a
library of bifunctional molecules according to the
method described in Example 9.

Figure 3 illustrates the structure of a preferred
bifunctional linker-support molecule based on a
controlled pore glass (CPG) support designated
"bf-CPG" that is useful in practicing the present
invention and is described in Example 3B.

Detailed Description of the Invention 
A. Encoded Combinatorial Chemical Libraries

An encoded combinatorial chemical library is
a composition comprising a plurality of species of
bifunctional molecules that each define a different
chemical structure and that each contain a unique
identifier oligonucleotide whose nucleotide sequence
defines the corresponding chemical structure.

1. Bifunctional Molecules 

A bifunctional molecule is the basic
unit in a library of this invention, and combines the
elements of a polymer comprised of a series of
chemical building blocks to form a chemical moiety in
the library, and a code for identifying the structure
of the chemical moiety.

Thus, a bifunctional molecule can be represented
by the formula A-B-C, where A is a chemical moiety, B
is a linker molecule operatively linked to A and C,
and C is an identifier oligonucleotide comprising a
sequence of nucleotides that identifies the structure
of chemical moiety A.

a. Chemical Polymers 

A chemical moiety in a
bifunctional molecule of this invention is represented
by A in the above formula A-B-C and is a polymer
comprising a linear series of chemical units
represented by the formula (X_n)_a, wherein X is a
single chemical unit in polymer A and n is a position
identifier for X in polymer A. n has the value of 1+i
where i is an integer from 0 to 10, such that when n
is 1, X is located most proximal to the linker (B).

Although the length of the polymer can vary,
defined by a, practical library size limitations arise
if there is a large alphabet size as discussed further
herein. Typically, a is an integer from 4 to 50.
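The trade-off between alphabet size and polymer length follows from the fact that a library containing every combination has (alphabet size)^a species. The following is a minimal sketch; the function name and the example ceiling of 10^9 species are assumptions chosen for illustration and do not come from this disclosure.

```python
def max_polymer_length(alphabet_size, max_species):
    """Largest polymer length a such that alphabet_size ** a <= max_species."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    length = 0
    species = 1
    # Grow the polymer one unit at a time until the next unit would
    # push the number of species past the stated ceiling.
    while species * alphabet_size <= max_species:
        species *= alphabet_size
        length += 1
    return length


# Hypothetical ceiling of one billion species per library.
for alphabet_size in (2, 4, 20):
    print(alphabet_size, max_polymer_length(alphabet_size, 10 ** 9))
```

Under that assumed ceiling, a 20-unit alphabet allows polymers of only 6 units, whereas a 2-unit alphabet allows 29.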

A chemical moiety (polymer A) can be any of a
variety of polymeric structures, depending on the
choice of classes of chemical diversity to be
represented in a library of this invention. Polymer A
can be built from any monomeric chemical unit that can
be coupled and extended in polymeric form. For example,
polymer A can be a polypeptide, oligosaccharide,
glycolipid, lipid, proteoglycan, glycopeptide,
sulfonamide, nucleoprotein, conjugated peptide (i.e.,
having prosthetic groups), polymer containing enzyme
substrates, including transition state analogues,
polypeptide analog polymers having an amide bond
replacement moiety such as ψ[COH2], ψ[COS], ψ[COO], or
ψ[PO2NH], and the like biochemical polymers.
Exemplary is the polypeptide-based library described
herein.

Where the library is comprised of peptide
polymers, the chemical unit X can be selected to form
a region of a natural protein or can be a non-natural
polypeptide, can be comprised of natural L-amino
acids, or can be comprised of non-natural amino acids
or mixtures of natural and non-natural amino acids.
The non-natural combinations provide for the
identification of useful and unique structural motifs
involved in biological interactions.

Non-natural amino acids include modified amino
acids and D-amino acids, the stereoisomers of L-amino
acids, and any other compounds that can form an amide
or pseudo-amide linkage.

The amino acid residues described herein are
preferred to be in the "L" isomeric form. "H" refers
to the free amino group present at the amino terminus
of a polypeptide. "OH" refers to the free carboxy
group present at the carboxy terminus of a
polypeptide. In keeping with standard polypeptide
nomenclature (J. Biol. Chem., 243:3552-59 (1969),
adopted at 37 C.F.R. §1.822(b)(2)), abbreviations for
amino acid residues are shown in the following Table
of Correspondence:



TABLE OF CORRESPONDENCE

         SYMBOL
1-Letter     3-Letter     AMINO ACID

Y            Tyr          tyrosine
G            Gly          glycine
F            Phe          phenylalanine
M            Met          methionine
A            Ala          alanine
S            Ser          serine
I            Ile          isoleucine
L            Leu          leucine
T            Thr          threonine
V            Val          valine
P            Pro          proline
K            Lys          lysine
H            His          histidine
Q            Gln          glutamine
E            Glu          glutamic acid
W            Trp          tryptophan
R            Arg          arginine
D            Asp          aspartic acid
N            Asn          asparagine
C            Cys          cysteine


The phrase "amino acid residue" is broadly
defined to include the amino acids listed in the Table
of Correspondence and modified and unusual amino
acids, such as those listed in 37 C.F.R. §1.822(b)(4),
which are incorporated herein by reference.

The polymer defined by chemical moiety A can
therefore contain any polymer backbone modifications
that provide increased chemical diversity. In
building of a polypeptide system as exemplary, a
variety of modifications are contemplated, including
the following backbone structures: -NHN(R)CO-,
-NHB(R)CO-, -NHC(RR')CO-, -NHC(=CHR)CO-, -NHC6H4CO-,
-NHCH2CHRCO-, -NHCHRCH2CO-, and lactam structures.

In addition, amide bond modifications are
contemplated including -COCH2-, -COS-, -CONR-, -COO-,
-CSNH-, -CH2NH-, -CH2CH2-, -CH2S-, -CH2SO-, -CH2SO2-,
-CH(CH3)S-, -CH=CH-, -NHCO-, -NHCONH-, -CONHO-,
-C(=CH2)CH2-, -PO2(-)NH-, -PO2(-)CH2-, -PO2(-)CH2N(+)-,
and -SO2NH-.

b. Polymer Identifier Oligonucleotide

An identifier oligonucleotide in a
bifunctional molecule of this invention is represented
by C in the above formula A-B-C and is an
oligonucleotide having a sequence represented by the
formula (Z_n)_a, wherein Z is a unit identifier
nucleotide sequence within oligonucleotide C that
identifies the chemical unit X at position n. n has
the value of 1+i where i is an integer from 0 to 10,
such that when n is 1, Z is located most proximal to
the linker (B). a is an integer as described
previously to connote the number of chemical unit
identifiers in the oligonucleotide.

For example, a bifunctional molecule can be
represented by the formula:

X_4 X_3 X_2 X_1 - B - Z_1 Z_2 Z_3 Z_4.

In this example, the sequence of oligonucleotides Z_1,
Z_2, Z_3 and Z_4 identifies the structure of chemical
units X_1, X_2, X_3 and X_4, respectively. Thus, there is
a correspondence in the identifier sequence between a
chemical unit X at position n and the unit identifier
oligonucleotide Z at position n.
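The position-by-position correspondence between X at position n and Z at position n can be illustrated with a lookup table. The following is a minimal sketch: the codebook, its hexanucleotide tags, and the function names are hypothetical placeholders, and a fixed tag length is assumed.

```python
# Hypothetical codebook: chemical unit -> unit identifier sequence.
UNIT_TO_TAG = {"Gly": "CACATG", "Met": "ACGGTA"}
TAG_TO_UNIT = {tag: unit for unit, tag in UNIT_TO_TAG.items()}
TAG_LENGTH = 6


def encode_polymer(units):
    """Return Z_1 Z_2 ... Z_a for units given in the order X_1, X_2, ..., X_a."""
    return "".join(UNIT_TO_TAG[unit] for unit in units)


def decode_identifier(identifier):
    """Return [X_1, ..., X_a] read from an identifier written linker-proximal first."""
    if len(identifier) % TAG_LENGTH != 0:
        raise ValueError("identifier length is not a multiple of the tag length")
    # Cut the identifier into consecutive tags and look each one up.
    tags = [identifier[i:i + TAG_LENGTH]
            for i in range(0, len(identifier), TAG_LENGTH)]
    return [TAG_TO_UNIT[tag] for tag in tags]


identifier = encode_polymer(["Gly", "Met", "Gly"])
print(identifier, decode_identifier(identifier))
```

An unknown tag raises a `KeyError` here; a practical decoder would need to decide how to treat such reads.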

The length of a unit identifier oligonucleotide
can vary depending on the complexity of the library,
the number of different chemical units to be uniquely
identified, and other considerations relating to
requirements for uniqueness of oligonucleotides such
as hybridization and polymerase chain reaction
fidelity. A typical length can be from about 2 to
about 10 nucleotides, although nothing is to preclude
a unit identifier from being longer.
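A lower bound on unit identifier length follows from counting: with four bases, a tag of length k can distinguish at most 4^k chemical units. The following is a minimal sketch with a hypothetical function name; it gives only the counting bound and ignores the hybridization and fidelity considerations mentioned above, which generally call for longer tags.

```python
def min_tag_length(num_units, bases=4):
    """Smallest k with bases ** k >= num_units (pure counting bound)."""
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    length = 1
    # Lengthen the tag until there are enough distinct sequences.
    while bases ** length < num_units:
        length += 1
    return length


for num_units in (4, 20, 100):
    print(num_units, min_tag_length(num_units))
```

Twenty amino acids, for example, need at least three nucleotides per unit (4^2 = 16 is too few, 4^3 = 64 suffices), which matches the triplet length of the natural genetic code.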



WO 93/20242                                PCT/US93/03127

Insofar as adenosine (A), guanosine (G),
thymidine (T) and cytidine (C) represent the typical
choices of deoxynucleotides for inclusion in a unit
identifier oligonucleotide, A, G, T and C form a
representative "alphabet" used to "spell" out a unit
identifier oligonucleotide's sequence. Other
nucleotides or nucleotide analogs can be utilized in
addition to or in place of the above four nucleotides,
so long as they have the ability to form Watson-Crick
pairs and be replicated by DNA polymerases in a PCR
reaction. However, the nucleotides A, G, T and C are
preferred.

For the design of the code in the identifier
oligonucleotide, it is essential to choose a coding
representation such that no significant part of the
oligonucleotide sequence can occur in another
unrelated combination by chance or otherwise during
the manipulations of a bifunctional molecule in the
library.

For example, consider a library where Z is a
trinucleotide whose sequence defines a unique chemical
unit X. Because the methods of this invention provide
for all combinations and permutations of an alphabet
of chemical units, it is possible for two different
unit identifier oligonucleotide sequences to have
closely related sequences that differ by only a
frameshift and therefore are not easily
distinguishable by hybridization or sequencing unless
the frame is clear.
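The frameshift problem can be made concrete by scanning adjacent unit identifiers for codewords that appear out of frame. The following is a minimal sketch; the three trinucleotide codewords are hypothetical and were picked only because they collide, and the function name is likewise an assumption.

```python
def shifted_frame_hits(codewords, length=3):
    """List (first, second, offset, word) where `word` is a codeword that
    appears at a non-zero frame offset inside the concatenation first+second."""
    words = set(codewords)
    hits = []
    for first in codewords:
        for second in codewords:
            joined = first + second
            # Offsets 1 .. length-1 are the out-of-frame reading positions.
            for offset in range(1, length):
                window = joined[offset:offset + length]
                if window in words:
                    hits.append((first, second, offset, window))
    return hits


# Hypothetical trinucleotide unit identifiers.
codewords = ["ACG", "CGT", "GTA"]
for hit in shifted_frame_hits(codewords):
    print(hit)
```

With these codewords, CGT followed by ACG reads as GTA one base out of frame, and GTA followed by CGT reads as ACG two bases out of frame, so a probe or sequencing read that has lost the frame cannot tell the units apart.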

Other sources of misreading of a unit identifier
oligonucleotide can arise. For example, mismatch in
DNA hybridization, transcription errors during a
primer extension reaction to amplify or sequence the
identifier oligonucleotide, and the like errors can
occur during a manipulation of a bifunctional
molecule.

The invention contemplates a variety of means to
reduce the possibility of error in reading the
identifier oligonucleotide, such as using longer
unit identifier nucleotide sequences so as to reduce
the similarity between unit identifier nucleotide
sequences. Typical lengths depend on the size of the
alphabet of chemical units.

A representative system useful for eliminating
read errors due to frameshift or mutation is a code
developed as a theoretical alternative to the genetic
code and is known as the commaless genetic code.
Crick et al., Proc. Natl. Acad. Sci. USA, 43:416-421
(1957).
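A commaless (comma-free) code is a set of equal-length codewords in which no codeword ever appears out of frame across the junction of any two codewords. The following is a minimal sketch, not the construction of Crick et al. or the code used in the Examples: it tests that defining property and builds one such trinucleotide code greedily. The function names are hypothetical, and the greedy result is not claimed to be of maximal size.

```python
from itertools import product


def is_comma_free(codewords):
    """True if no codeword occurs at a shifted frame in any pair of codewords."""
    words = set(codewords)
    for first in words:
        for second in words:
            joined = first + second
            # Check every out-of-frame window spanning the junction.
            for offset in range(1, len(first)):
                if joined[offset:offset + len(first)] in words:
                    return False
    return True


def greedy_comma_free(alphabet="ACGT", length=3):
    """Add candidate words in lexicographic order whenever the set stays comma-free."""
    chosen = []
    for letters in product(alphabet, repeat=length):
        candidate = "".join(letters)
        if is_comma_free(chosen + [candidate]):
            chosen.append(candidate)
    return chosen


triplet_code = greedy_comma_free()
print(len(triplet_code), triplet_code)
```

A word such as AAA can never belong to such a code, because AAA followed by AAA contains AAA in a shifted frame; for four bases and triplets, no comma-free code can contain more than 20 words.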

Where the chemical units are amino acids, a
convenient unit identifier nucleotide sequence is the
well known genetic code using triplet codons or the
use of a commaless code as described by Crick et al.,
supra. The invention need not be limited by the
translation afforded between the triplet codon of the
genetic code and the natural amino acids; other
systems of correspondence can be assigned.

A typical and exemplary unit identifier
nucleotide sequence is based on the commaless code
described in the Examples, and having a length of six
nucleotides (hexanucleotide) per chemical unit to
provide a length that assures complementary
hybridization.

Preferably, an identifier oligonucleotide has at
least 15 nucleotides in the tag (coding) region for
effective hybridization. In addition, considerations
of the complexity of the library, the size of the
alphabet of chemical units, and the length of the
polymer in the chemical moiety all contribute
to the length of the identifier oligonucleotide as
discussed in more detail herein.

In a preferred embodiment, an identifier
oligonucleotide C has a nucleotide sequence according
to the formula P1-(Z_n)_a-P2, where P1 and P2 are
nucleotide sequences that provide polymerase chain
reaction (PCR) primer binding sites adapted to amplify
the polymer identifier oligonucleotide. The
requirements for PCR primer binding sites are
generally well known in the art, but are designed to
allow a PCR amplification product (a PCR-amplified
duplex DNA fragment) to be formed that contains the
polymer identifier oligonucleotide sequences.
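The layout P1-(Z_n)_a-P2 can be pictured as a coding region sandwiched between two fixed flanks. The following is a minimal sketch: the P1 and P2 sequences and the unit tags are hypothetical placeholders rather than the sequences of the Examples, and only one strand is modelled (in practice one primer anneals to the complementary strand).

```python
def build_identifier(p1, unit_tags, p2):
    """Assemble P1-(Z_n)_a-P2 from the flanks and the ordered unit tags."""
    return p1 + "".join(unit_tags) + p2


def extract_coding_region(identifier, p1, p2):
    """Strip the primer-binding flanks and return the (Z_n)_a region."""
    if not (identifier.startswith(p1) and identifier.endswith(p2)):
        raise ValueError("identifier is not flanked by the expected P1 and P2")
    return identifier[len(p1):len(identifier) - len(p2)]


# Hypothetical primer-binding flanks and unit identifier tags.
P1 = "GGATCCTAGC"
P2 = "TTCGAAGCTT"
tags = ["CACATG", "ACGGTA"]

identifier = build_identifier(P1, tags, P2)
print(identifier)
print(extract_coding_region(identifier, P1, P2))
```

Because P1 and P2 are the same on every member of the library, a single primer pair can amplify any identifier regardless of which chemical units it encodes.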

The presence of the two PCR primer binding sites,
P1 and P2, flanking the identifier oligonucleotide
sequence (Z_n)_a provides a means to produce a
PCR-amplified duplex DNA fragment derived from the
bifunctional molecule using PCR. This design is
useful to allow the amplification of the tag sequence
present on a particular bifunctional molecule for
cloning and sequencing purposes in the process of
reading the identifier code to determine the structure
of the chemical moiety in the bifunctional molecule.

More preferred is a bifunctional molecule where
one or both of the nucleotide sequences P1 and P2 are
designed to contain a means for removing the PCR
primer binding sites from the identifier
oligonucleotide sequences. Removal of the flanking P1
and P2 sequences is desirable so that their sequences
do not contribute to a subsequent hybridization
reaction. A preferred means for removing the PCR primer
binding sites from a PCR amplification product is in
the form of a restriction endonuclease site within the
PCR-amplified duplex DNA fragment.

Restriction endonucleases are well known in the
art and are enzymes that recognize specific short
sequences of duplex DNA and cleave the DNA in a
sequence-specific manner.

Preferably, the restriction endonuclease sites
should be positioned proximal to (Z_n)_a relative to the
PCR primer binding sites to maximize the amount of P1
and P2 that is removed upon treating a bifunctional
molecule with the specific restriction endonuclease.
More preferably, P1 and P2 each are adapted to form a
restriction endonuclease site in the resulting
PCR-amplified duplex DNA, and the two restriction sites,
when cleaved by the restriction endonuclease, form
non-overlapping cohesive termini to facilitate
subsequent manipulations.
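The effect of placing restriction sites next to the coding region can be sketched by locating the sites on one strand and trimming at the cut positions. The following is a minimal sketch under stated assumptions: it uses the recognition sequences of Sty I (CCWWGG, cut after the first C) and Apa I (GGGCCC, cut after the fifth base), the two enzymes named in Figure 1; the flanking sequences and coding region are hypothetical; and only the top strand is modelled, so the overhangs on the complementary strand are not represented.

```python
import re

# Recognition sequences written 5' to 3' on the top strand; W means A or T.
STY_I = re.compile(r"CC[AT][AT]GG")
APA_I = re.compile(r"GGGCCC")


def excise_between_sites(top_strand):
    """Return the top-strand fragment left after Sty I and Apa I cleavage."""
    sty = STY_I.search(top_strand)
    if sty is None:
        raise ValueError("no Sty I site found")
    # Look for the Apa I site downstream of the Sty I site.
    apa = APA_I.search(top_strand, sty.end())
    if apa is None:
        raise ValueError("no Apa I site found downstream of the Sty I site")
    left_cut = sty.start() + 1    # Sty I cuts C^CWWGG on the top strand
    right_cut = apa.start() + 5   # Apa I cuts GGGCC^C on the top strand
    return top_strand[left_cut:right_cut]


# Hypothetical amplified strand: flank + Sty I site + coding + Apa I site + flank.
coding = "CACATGACGGTA"
amplified = "AAAA" + "CCAAGG" + coding + "GGGCCC" + "TTTT"
fragment = excise_between_sites(amplified)
print(fragment)
```

Only a few site-derived bases remain on each side of the coding region, which is the point of positioning the sites proximal to (Z_n)_a.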

Particularly preferred are restriction sites that
when cleaved provide overhanging termini adapted for
termini-specific modifications such as incorporation
of a biotinylated nucleotide (e.g., biotinyl
deoxy-UTP) to facilitate subsequent manipulations.

The above-described preferred embodiments of an
identifier oligonucleotide are summarized in a
specific embodiment shown in Figure 1.

In Figure 1, a PCR-amplified duplex DNA is shown
that is derived from an identifier oligonucleotide
described in the Examples. The (Z_n)_a sequence is
illustrated in the brackets as the coding sequence and
its complementary strand of the duplex is indicated in
the brackets as the anticoding strand. The P1 and P2
sequences are shown in detail with a Sty I restriction
endonuclease site defined by the P1 sequence located
5' to the bracket and an Apa I restriction
endonuclease site defined by the P2 sequence located
3' to the bracket.

Step 1 illustrates the cleavage of the
PCR-amplified duplex DNA by the enzymes Sty I and Apa I
to form a modified identifier sequence with cohesive
termini. Step 2 illustrates the specific
biotinylation of the anticoding strand at the Sty I
site, whereby the incorporation of biotinylated UTP is
indicated by a B.

The presence of non-overlapping cohesive termini
after Step 1 in Figure 1 allows the specific and
directional cloning of the restriction-digested
PCR-amplified fragment into an appropriate vector, such
as a sequencing vector. In addition, the Sty I site was
designed into P1 because the resulting overhang is a
substrate for a filling-in reaction with dCTP and
biotinyl-dUTP (BTP) using DNA polymerase Klenow
fragment. The other restriction site, Apa I, was
selected to not provide substrate for the above
biotinylation, so that only the anticoding strand can
be biotinylated.

Once biotinylated, the duplex fragment can be
bound to immobilized avidin and the duplex can be
denatured to release the coding sequence containing
the identifier nucleotide sequence, thereby providing
purified anticoding strand that is useful as a
hybridization reagent for selection of related coding
strands as described further herein.
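The use of a purified anticoding strand as a hybridization reagent rests on Watson-Crick complementarity: the anticoding strand is the reverse complement of the coding strand it selects. The following is a minimal sketch that idealizes hybridization as exact, full-length matching; the function names and tag sequences are hypothetical, and real hybridization tolerates some mismatch depending on conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def select_by_hybridization(coding_strands, anticoding_probe):
    """Keep coding strands that pair perfectly with the anticoding probe."""
    target = reverse_complement(anticoding_probe)
    return [strand for strand in coding_strands if strand == target]


# Hypothetical coding strands carried by two library members.
library_tags = ["CACATGACGGTA", "ACGGTACACATG"]
probe = reverse_complement("CACATGACGGTA")

print(probe)
print(select_by_hybridization(library_tags, probe))
```

In this idealized model the probe recovers only the library member whose coding strand it was derived from.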



c. Linker Molecules

A linker molecule in a
bifunctional molecule of this invention is represented
by B in the above formula A-B-C and can be any
molecule that performs the function of operatively
linking the chemical moiety to the identifier
oligonucleotide.

Preferably, a linker molecule has a means for
attaching to a solid support, thereby facilitating
synthesis of the bifunctional molecule in the solid
phase. In addition, attachment to a solid support
provides certain features in practicing the screening
methods with a library of bifunctional molecules as
described herein. Particularly preferred are linker
molecules in which the means for attaching to a solid
support is reversible, namely, that the linker can be
separated from the solid support.

A linker molecule can vary in structure and
length, and provides at least two features: (1)
operative linkage to chemical moiety A, and (2)
operative linkage to identifier oligonucleotide C. As
the nature of chemical linkages is diverse, any of a
variety of chemistries may be utilized to effect the
indicated operative linkages to A and to C, as the
nature of the linkage is not considered an essential
feature of this invention. The size of the linker in
terms of the length between A and C can vary widely,
but for the purposes of the invention, need not exceed
a length sufficient to provide the linkage functions
indicated. Thus, a chain length of from at least one
to about 20 atoms is preferred.

Preferred linker molecules, described in
Example 3 herein, contain the added, preferred
element of a reversible means for attachment to a
solid support. That is, the bifunctional molecule is
removable from the solid support after synthesis. In
another embodiment, the linker has the ability to be
cleaved slowly over time to release the bifunctional
molecule, or portions thereof.

A variety of chemo- or enzyme-selective cleavage
functionalities could be incorporated into a
linker-support molecule useful in the present invention.
For example, a 4-hydroxymethyl phenoxyacetic acid moiety
provides an acid-cleavable linker. A
2-[(tert-butyldiphenylsiloxy)methyl]benzoic acid moiety
provides a fluoride-cleavable moiety. The phosphate
of a 2-hydroxymethyl benzoic acid moiety provides a
site cleavable by the combination of alkaline
phosphatase treatment followed by mild alkaline
treatment. Thus, the incorporation of selectively
cleavable linkers other than those recited above is
also considered a part of the invention.

Solid supports for chemical synthesis are
generally well known. Particularly preferred are the
synthetic resins and controlled pore glass (CPG)
supports used in oligonucleotide and in polypeptide
synthesis that are available from a variety of
commercial sources including Glen Research (Herndon,
VA), Bachem Biosciences (Philadelphia, PA), Sigma
Chemical Co. (St. Louis, MO), CPG Inc. (Fairfield,
NJ) and Applied Biosystems (Foster City, CA). Most
preferred are teflon and CPG supports such as are
described in Example 2.

In a related embodiment, the invention describes
a preferred bifunctional solid support particularly
suited for producing a bifunctional molecule of this
invention having a polypeptide as the chemical moiety
(i.e., an oligopeptide/oligonucleotide conjugate).

A preferred bifunctional solid support for
synthesizing oligopeptide/oligonucleotide conjugates
comprises a solid support, a first linkage unit, a
second linkage unit, and a bifunctional unit. A
"bifunctional unit" in this context is not to be
confused with a bifunctional molecule of this
invention, and refers instead to that chemical moiety
present on a bifunctional solid support which provides
the two (bi-) reactive functionalities, one for
coupling the peptide and another for coupling the
oligonucleotide. An exemplary bifunctional unit is
shown in Figure 3 as the serine-branch monomer
following the aminohexanol linker.

The solid support used herein is of a type which
is dispersible in aqueous solution, such as those
described above. A preferred solid support is a
controlled pore glass of a type employable for peptide
and oligonucleotide synthesis, e.g. aminopropyl-CPG
supplied by Sigma. The solid support is tightly
coupled to the first linkage unit. A preferred first
linkage unit includes a sarcosine linker coupled to
the aminopropyl-CPG and a succinyl linker coupled to
the sarcosine linker by means of an amide bond. The
second linkage unit is, in turn, coupled to the first
linkage unit. A preferred second linkage unit is an
aminohexanol group. A preferred bond for coupling the
first and second linkage units is an alkyl ester.
Alkyl esters are readily hydrolysed or cleaved upon
exposure to concentrated aqueous ammonia.

The bifunctional unit has a first leaving group
employable for oligopeptide synthesis and a second
leaving group employable for oligonucleotide
synthesis. A preferred bifunctional unit is an
L-serine residue. The L-serine residue includes an
amino end, a carboxyl end, and a hydroxyl end. The
serine residue is coupled at its carboxyl end to the
second linkage unit. In a preferred embodiment, the
carboxyl end of the serine residue is coupled by means
of an amide bond to an aminohexanol linker. The
serine residue is also coupled at its amino end to the
first leaving group and at its hydroxyl end to the
second leaving group. A preferred first leaving group
is N-FMOC [N-(9-fluorenylmethoxycarbonyl)] or its
functional equivalent. A preferred second leaving
group is O-DMT (O-dimethoxytrityl) or its functional
equivalent.

The solid support, the first linkage unit, the
second linkage unit, the cleavable bond therebetween,
and the bifunctional unit, exclusive of said first and
second leaving groups, are each substantially
chemically unreactive under conditions employed during
conventional oligopeptide synthesis protocols using
FMOC leaving groups [see: Bodanszky et al., in The
Practice of Peptide Synthesis, Springer-Verlag,
(1984); and Bodanszky et al., in Principles of
Peptide Synthesis, Springer-Verlag, (1984)] and under
conditions employed during conventional
oligonucleotide synthesis protocols using O-DMT
leaving groups and phosphoramidite donors.

In an alternative embodiment, the bifunctional
solid support also includes a third linkage unit. The
third linkage unit is interposed between and coupled
to the bifunctional unit and the first leaving group.
In a preferred mode of this embodiment, the third
linkage unit is photosensitive, i.e. it is cleavable
by exposure to ultraviolet light. The third linkage
unit may include a 3-nitro-4-bromomethyl benzoate
group coupled by means of an amide bond to the amino
end of the bifunctional serine residue and coupled by
means of an ester bond to an FMOC blocked amino acid.



2. Libraries 

A library of this invention is a
repertoire of chemical diversity comprising a
plurality of species of bifunctional molecules
according to the present invention. The plurality of
species in a library defines a family of chemical
diversity whose species each have a different chemical
moiety. Thus the library can define a family of
peptides, lipids, oligosaccharides or any of the other
classes of chemical polymers recited previously.

The number of different species in a library
represents the complexity of a library and is defined
by the polymer length of the chemical moiety, and by
the size of the chemical unit alphabet that can be
used to build the chemical unit polymer. The number
of different species referred to by the phrase
"plurality of species" in a library can be defined by
the formula V^a, i.e., V to the power of a (exponent
a). V represents the alphabet size, i.e., the number
of different chemical units X available for use in the
chemical moiety. "a" is an exponent to V and
represents the number of chemical units of X forming
the polymer A, i.e., the length of polymer A.

For example, for a bifunctional molecule where
polymer A is a peptide having a length of 6 amino
acids, and where the amino acids utilized can be any
of the 20 natural amino acids, the alphabet (V) is 20
and the polymer length (a) is 6, and the library size
is 20^6 or 64 million. This exemplary library provides
a repertoire of chemical diversity comprising 64
million different hexameric polypeptides operatively
linked to corresponding unique identifier
oligonucleotides.
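The arithmetic of the formula V^a can be checked with a
short sketch. The function name library_complexity is
illustrative only and does not appear in this
specification.

```python
def library_complexity(alphabet_size: int, polymer_length: int) -> int:
    """Return V^a: the number of distinct species for an alphabet of
    V chemical units and a polymer of length a."""
    if alphabet_size < 1 or polymer_length < 0:
        raise ValueError("alphabet_size must be >= 1 and polymer_length >= 0")
    return alphabet_size ** polymer_length


# The hexapeptide example from the text: V = 20, a = 6.
hexapeptides = library_complexity(20, 6)
print(f"{hexapeptides:,} species")  # 64,000,000 species
```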

Because the complexity of the library will
determine the amount of a particular species of
bifunctional molecule relative to the other species in
the library, there are theoretical limits to the
maximum useful complexity in a library. Another
limitation is the library size defined by the number
of actual support molecules or beads present in the
library. Therefore it is useful to consider how large
(complex) a library should be. This size limit is
dictated by the level of sensitivity for detecting the
presence of a polymer identifier oligonucleotide after
a screening procedure according to this invention.
Detection sensitivity is dictated by the threshold of
binding or catalytic activity between an acceptor
molecule to be assayed and a bifunctional molecule.

If, for example, the binding threshold is 10^-6 M
(micromolar), then there must be at least one nanomole
of each species in a library of 1 milliliter (ml)
volume. At this threshold, a library having a
complexity of 10^4 would contain a total of 10
micromoles of bifunctional molecules, i.e., one
nanomole of each species. Because of the reciprocal
relationship between library complexity and binding
threshold, more complex libraries are possible where
the binding threshold is lower.
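The relationship between binding threshold, library
volume and complexity can be illustrated as follows.
This is a minimal sketch that assumes the threshold is
read as a molar concentration (here the micromolar
value, 10^-6 M) that every species must reach in the
library volume; the function names are hypothetical.

```python
def moles_per_species(threshold_molar: float, volume_liters: float) -> float:
    """Minimum amount of one species needed to reach the threshold
    concentration in the given volume (moles = molarity x liters)."""
    return threshold_molar * volume_liters


def total_moles(complexity: int, threshold_molar: float, volume_liters: float) -> float:
    """Total material when every one of `complexity` species is present
    at exactly the threshold amount."""
    return complexity * moles_per_species(threshold_molar, volume_liters)


# Assumed values: micromolar threshold, 1 ml library, 10^4 species.
per_species = moles_per_species(1e-6, 1e-3)   # about 1e-9 mol, one nanomole
total = total_moles(10**4, 1e-6, 1e-3)        # about 1e-5 mol, ten micromoles
print(per_species, total)
```

Under these assumptions a tenfold lower threshold
permits a tenfold more complex library for the same
total amount of material.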

The relative amounts of the individual
bifunctional molecule species within the library can
vary from about 0.2 equivalents to about 10
equivalents, where an equivalent represents the
average amount of a species within the library.
Preferably each species is present in the library in
approximately equimolar amounts.
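The notion of an "equivalent" as the average amount of
a species can be made concrete with a small sketch.
The species names and amounts below are invented for
illustration, and the function names are hypothetical.

```python
def equivalents(amounts: dict[str, float]) -> dict[str, float]:
    """Express each species' amount relative to the library average."""
    mean = sum(amounts.values()) / len(amounts)
    return {species: amount / mean for species, amount in amounts.items()}


def within_stated_range(amounts: dict[str, float],
                        low: float = 0.2, high: float = 10.0) -> bool:
    """True when every species lies between `low` and `high` equivalents,
    the range given in the text."""
    return all(low <= eq <= high for eq in equivalents(amounts).values())


# Hypothetical amounts in nanomoles for three species.
example = {"species_1": 1.0, "species_2": 1.0, "species_3": 4.0}
print(equivalents(example))
print(within_stated_range(example))
```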

In a preferred embodiment, a library contains the
complete repertoire of chemical diversity possible
based on the mathematical combinations for a given
library where there is a fixed alphabet and a
preselected number of chemical units in all species of
the library. Thus a complete repertoire is one that
provides a source of all the possible chemical
diversity that can be found in a library of this
invention having a fixed alphabet and chemical length.

It is particularly preferred that a library be
comprised of bifunctional molecules where each species
of bifunctional molecule contains the same nucleotide
sequence for either the P1 or P2 PCR primer binding
sites. A library with this design is particularly
preferred because, when practicing the methods of this
invention, a single PCR primer pair can be used to
amplify any species of identifier oligonucleotide
(coding sequence) present in the library.

In a related embodiment, the invention
contemplates an element of the library, namely a
bifunctional molecule of this invention. A preferred
library element is an oligopeptide/oligonucleotide
conjugate as described earlier.

Thus, an element of a library of
oligopeptide/oligonucleotide conjugates in one
embodiment comprises a solid support, a first linkage
unit, a second linkage unit, a bifunctional unit, an
oligopeptide, and an oligonucleotide. The solid
support is of a type which is dispersible in aqueous
solution. The first linkage unit is coupled to the
solid support. The second linkage unit is coupled to
the first linkage unit. The bifunctional unit is
coupled to the second linkage unit. The oligopeptide
and the oligonucleotide are attached to the
bifunctional unit.

In an alternative embodiment, the library element
includes a cleavable bond for coupling the first
linkage unit to the second linkage unit. The
cleavable bond is of a type which is cleavable by
exposure to concentrated aqueous ammonia.

In another alternative embodiment, the library
element includes a cleavable bond for coupling the
bifunctional unit to the oligopeptide. The cleavable
bond may be of a type which is cleavable by exposure
to ultraviolet light, e.g. a 3-nitro-4-bromomethyl
benzoate group coupled by means of an amide bond to
the amino end of the bifunctional serine residue and
coupled by means of an ester bond to an FMOC blocked
amino acid.

The library element may also be made without the
solid support and the first and/or second linkage
units. In this instance, the library element
comprises a bifunctional unit, an oligopeptide
attached to said bifunctional unit, and an
oligonucleotide attached to said bifunctional unit.
An alternative to this embodiment includes a cleavable
bond for coupling the bifunctional unit to the
oligopeptide. The cleavable bond may be of a type
which is photosensitive, i.e. cleavable by exposure to
ultraviolet light, as indicated above.



B. Methods for Producing a Library

The present method for producing a plurality
of bifunctional molecules to form a library of this
invention solves a variety of problems regarding
efficient synthesis of large numbers of different
species.

In the present synthesis methods, the sequential
steps of first adding a chemical unit X followed by
the addition of an oligonucleotide sequence to the
linker molecule require an alternating parallel
synthesis procedure to add chemical unit X and then
add a unit identifier nucleotide sequence Z that
defines (codes for) that corresponding chemical unit.
The library is built up by the repetition of this
alternating parallel process after pooling and
division of the reaction products as described herein.

The only constraint for making an encoded library
is that there must be compatible chemistries between
the two alternating synthesis procedures for adding a
chemical unit as compared to that for adding a
nucleotide or oligonucleotide sequence.

The problem of synthesis compatibility is solved
by the correct choice of compatible protecting groups
as the alternating polymers are synthesized, and by
the correct choice of methods for deprotection of one
growing polymer selectively while the other growing
polymer remains blocked, such as by the use of
transient protection groups, or the use of protection
groups removable under specific deprotection
chemistry. Suitable compatible protection chemistries
are described herein, and additional suitable
chemistries are also described by the chemical
protecting group database, available from Synopsys,
Inc., or described by Greene et al., in "Protective
Groups in Organic Synthesis", 2nd Ed., John Wiley &
Sons (1991).

The synthesis of a library having a plurality of
bifunctional molecules comprises the following steps:

(1) A linker molecule is provided that has
suitable means for operatively linking the first
chemical unit X1 and for operatively linking the first
nucleotide sequence defining a unit identifier
nucleotide Z1 whose sequence codes for (defines) the
structure of chemical unit X1. Preferably the linker
has a means for attachment to a solid support, and as
such allows for the synthesis to proceed in the solid
phase.

Thus the provided linker molecule has a structure
A'-B-C', where A' represents a terminus adapted for
reaction to operatively link a chemical unit X in
precursor form (X'), and C' represents a terminus
adapted for reaction to operatively link a nucleotide
or polymer identifier oligonucleotide Z in precursor
form (Z'). The termini A' and C' are protected by
respective blocking groups so that during operative
linking reactions at one terminus, the other terminus
is protected from reaction.

(2) The linker molecule is then subjected
to a first cycle of synthesis to add a building block
at one terminus. The order of synthesis is not
generally important insofar as one may elect to add a
chemical unit X first to terminus A', or add an
identifier oligonucleotide Z first to terminus C'. A
first cycle involves the steps of deprotecting the
terminus of the linker to which a building block is to
be added and then adding the building block to the
terminus. Typically, the added building block contains
a blocking group at its free terminus, i.e., the
terminus that will participate in an addition of the
next building block of its type. The linker molecule
is then subjected to a second cycle of synthesis to
add a building block at the other (second) terminus. A
second cycle involves the steps of deprotecting the
second terminus of the linker to which a building block
is to be added and then adding the building block to
the terminus. Again, the added building block is
typically blocked at its free terminus.

The addition of identifier oligonucleotide Z to
terminus C' can be conducted either nucleotide by
nucleotide to form the complete unit identifier
nucleotide sequence Z, or Z can be presynthesized, and
the oligonucleotide Z added as a block to terminus C'.
Insofar as the synthesis of oligonucleotides is well
known in the arts, the presynthesis of
oligonucleotides, and their addition to the growing
nucleotide polymer in blocks is preferred because it
reduces the number of manipulations in synthesizing a
bifunctional molecule.

A chemical unit X or a unit identifier
oligonucleotide Z is referred to as a precursor (X' or
Z') to indicate that it contains a leaving group
compatible with the reaction chemistry that
facilitates the precursor's operative linkage to the
growing polymer at the appropriate terminus.

The product resulting from step (2) is a
bifunctional molecule having the structure
A'-X1-B-Z1-C', and is ready for a repetition of the
above first and second cycles to add X2 and Z2 to the
growing polymers.

(3) After the bifunctional molecule product
A'-X1-B-Z1-C' is formed, aliquots of the product are
made, and the cycles in step (2) are repeated on each
aliquot, with the exception that a different species
of X (and its corresponding Z) is added in each
different aliquot. The reaction product in each
aliquot has the structure A'-X2-X1-B-Z1-Z2-C'.

(4) The aliquots each containing the product
A'-X2-X1-B-Z1-Z2-C' are combined (pooled) to form a
mixture of different bifunctional molecules, and the
mixture is divided into aliquots. The cycles in step
(2) are repeated again on each aliquot, with different
X and Z building blocks being added to each aliquot to
form the bifunctional molecule product
A'-X3-X2-X1-B-Z1-Z2-Z3-C'.

The process of pooling, aliquoting and adding a
next set of building blocks X and Z can be repeated at
positions n=4, 5, 6 ... and so on depending on the
length of polymers desired. As the cycles are
repeated, and the polymers grow in length, the
complexity of the resulting library also increases.
For each cycle, the polymer length a increases by one
and the library complexity therefore increases
exponentially according to the formula V^a. In
preferred embodiments, the cycles are repeated from
about 1 to 10 times.

In a related embodiment the provided linker in
step (1) is first divided into aliquots, and the
cycles of step (2) are conducted on each aliquot
adding a different X and corresponding Z to the linker
in each different aliquot. The aliquots are then
pooled as before, and the cycles of step (2) can be
repeated on one or more aliquots.

Thus the steps of (i) dividing a linker or pool
into aliquots, (ii) parallel addition of X and Z to
the linker substrate in separate aliquots, and (iii)
pooling of the aliquots, can be cycled (repeated) to
sequentially add the chemical units and their
corresponding unit identifier oligonucleotides to form
the library comprising a plurality of bifunctional
molecules each having a different chemical polymer
operatively linked through the linker to a
corresponding identifier oligonucleotide.
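The divide, couple and pool cycle can be sketched as a
small simulation. The sketch tracks species rather
than individual molecules, and it assumes every
species in the pool is represented in every aliquot.
The unit alphabet and the six-nucleotide identifier
sequences are hypothetical values chosen for
illustration; the function name is likewise invented.

```python
def split_and_pool(units: dict[str, str], cycles: int) -> set[tuple[str, str]]:
    """Simulate encoded split-and-pool synthesis at the species level.

    `units` maps each chemical unit X to its unit identifier sequence Z.
    Each library member is a (polymer, identifier) pair, both written in
    order of addition starting from the linker.
    """
    pool = {("", "")}  # the bare linker: nothing added at either terminus
    for _ in range(cycles):
        aliquots = []
        for unit, codon in units.items():
            # (i) divide: one aliquot per unit species;
            # (ii) couple X and its corresponding Z within that aliquot.
            aliquots.append({(polymer + unit, tag + codon)
                             for polymer, tag in pool})
        # (iii) pool the aliquots before the next cycle.
        pool = set().union(*aliquots)
    return pool


# Hypothetical two-letter alphabet with made-up identifier sequences.
alphabet = {"G": "CACATG", "M": "ACGGTA"}
library = split_and_pool(alphabet, cycles=3)
print(len(library))  # 8 species, i.e. 2^3
```

Because X and Z are always added together within one
aliquot, each polymer in the resulting set carries an
identifier that records its order of assembly.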

In a preferred embodiment, a method for forming a
library of this invention includes the steps for
addition of the PCR primer binding sites P1 and P2 to
each of the bifunctional molecules in the library.

The method is substantially the same as above,
but includes the addition of a series of nucleotides
or a presynthesized P1 oligonucleotide to the linker
molecule provided in step (1) prior to the cycles of
step (2) that add X and Z. Because all members of the
library are to contain the same P1 sequence, P1 is
added to the C' terminus of linker molecule A'-B-C'
prior to dividing the linker into aliquots and
subjecting the aliquots to the cycles of step (2)
adding X1 and Z1. The resulting product has the
formula A'-B-P1-C'.

Thereafter, the product is aliquoted and cycled
as before, resulting in the preparation of the product
A'-(Xn)a-B-P1-(Zn)a-C', where a indicates the presence
of a polymer of length "a".

Next, the pooled admixture containing product
A'-(Xn)a-B-P1-(Zn)a-C' is subjected to the addition of
a series of nucleotides or a presynthesized
oligonucleotide P2 at terminus C' to form the product
A'-(Xn)a-B-P1-(Zn)a-P2-C'. Thus all members of the
library contain a common sequence P1 and a common
sequence P2 from which universal PCR reactions can be
conducted, regardless of the species of bifunctional
molecule present from the library.
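The arrangement P1-(Zn)a-P2 can be illustrated with a
brief sketch in which the coding region is recovered
by removing the two common flanking sequences. The P1
and P2 sequences and the function names below are
hypothetical, and the sketch treats the identifier as
a single strand, ignoring the reverse-complement
bookkeeping an actual primer pair would require.

```python
# Hypothetical primer binding site sequences, invented for illustration.
P1 = "GGGCCCTATTCTTAG"
P2 = "AGCTACTTCCCAAGG"


def build_identifier(unit_identifiers: list[str]) -> str:
    """Assemble P1-(Zn)a-P2 from the ordered unit identifier sequences."""
    return P1 + "".join(unit_identifiers) + P2


def coding_region(identifier: str) -> str:
    """Strip the common P1 and P2 flanks and return the coding sequence."""
    if not (identifier.startswith(P1) and identifier.endswith(P2)):
        raise ValueError("identifier lacks the common P1/P2 flanks")
    return identifier[len(P1):len(identifier) - len(P2)]


tag = build_identifier(["CACATG", "ACGGTA"])
print(coding_region(tag))  # CACATGACGGTA
```

Since every member shares the same flanks, one
flank-matching rule (and, by analogy, one primer pair)
serves the whole library whatever the coding region.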

In a related embodiment, the method for producing
a library of this invention utilizes a bf-CPG solid
support, as described herein.

1. Polypeptide Libraries

In one preferred embodiment, the
invention contemplates a library, and methods of
producing the library, where the bifunctional molecule
has a polypeptide for polymer A.

In this embodiment, the compatible chemistries
for sequentially adding amino acids and
oligonucleotides to the growing polymers have been
developed for the synthesis of an amino acid polymer
in the direction of carboxy to amino terminus, and
alternatively in the direction of amino to carboxy
terminus. Chemistries have also been developed for
the synthesis of an oligonucleotide polymer in the
direction of 3' to 5', and alternatively in the
direction of 5' to 3'. In addition, in each of these
syntheses it is preferred that the amino acid side
chains (R groups) be blocked for certain amino acid
residues where the R group provides an otherwise
reactive terminus during one of the synthesis or
deblocking steps.

Each type of chemistry will be described in 
detail herein below. 

For any of the syntheses, the reactive side
chains of several amino acids must be blocked. Table
1 below lists those of the natural amino acids which
have an R group that preferably contains a blocking
group. Any compatible protecting (blocking) group may
be utilized, and the invention is not to be limited
to any particular blocking group. Also indicated in
Table 1 are preferred blocking groups.

TABLE 1

Amino Acid        Blocking Group
--------------    ------------------------------------
Arginine          N-MTr (1), N-PMC (7)
Histidine         N*-Bum (2), SEM (10), FMOC, DNP
Cysteine          S-Trt (3), Acm (11), S-t-butyl
Tryptophan        N'-CHO, none
Tyrosine          O-TBS (4)
Aspartic acid     O-TSE (5), DMB (12)
Glutamic acid     O-TSE (5), DMB
Serine            O-TBS (4)
Threonine         O-TBS (4)
Lysine            N-Bz (6), TFA (8), TEOC (9)
Asparagine        none
Glutamine         none
Glycine           none
Phenylalanine     none
Methionine        none
Alanine           none
Isoleucine        none
Leucine           none
Valine            none
Proline           none

(1) MTr is N^G-4-methoxy-2,3,6-trimethylbenzene
sulfonyl.
(2) Bum is tert-butoxymethyl.
(3) Trt is triphenylmethyl.
(4) TBS is tert-butyl-dimethylsilylester.
(5) TSE is trimethylsilylethylester.
(6) Bz is benzyl.
(7) PMC is N^G-2,2,5,7,8-pentamethylchroman-6-
sulphonyl.
(8) TFA is trifluoroacetyl.
(9) TEOC is β-(trimethylsilyl)ethoxycarbonyl.
(10) SEM is β-(trimethylsilyl)ethoxymethyl.
(11) Acm is acetamidomethyl.
(12) DMB is dimethoxybenzyl.

Protected amino acids suitable as a blocked
precursor for addition to a bifunctional molecule can
be obtained from a variety of commercial vendors
including Bachem Biosciences Inc. (Philadelphia, PA),
Peninsula Labs (CA), and Nova Biochem (CA). In
addition, the preparation of protected amino acids is
described at least in Example 1.

a. Polypeptide Synthesis

For synthesis of a polypeptide on
the linker substrate in the direction of carboxy to
amino terminus, a free amino terminus on the linker is
required that can be conveniently blocked and
deblocked as needed. A preferred amino terminus
blocking group is a fluorenylmethoxycarbonyl group
(FMOC).

FMOC blocked amino termini are deblocked using
20% (v/v) piperidine in dimethylformamide (DMF), or
using 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU) in
dichloromethane (DCM) as is well known for polypeptide
synthesis. The amino acid units are added in the form
of blocked amino acids having FMOC blocked amino
termini and a carboxyl terminus blocked with
pentafluorophenyl ester (Opfp) or by the activation of
the free acid using BOP, HBTU, TBTU or preferably with
pyBOP. BOP is benzotriazole-1-yl-oxy-
tris(dimethylamino)-phosphonium hexafluorophosphate
(Castro's reagent). HBTU is 2-(1H-benzotriazole-1-
yl)-1,1,3,3-tetramethyluronium hexafluorophosphate.
TBTU is 2-(1H-benzotriazole-1-yl)-1,1,3,3-
tetramethyluronium tetrafluoroborate. pyBOP is
benzotriazol-1-yloxytripyrrolidinophosphonium
hexafluorophosphate. The addition reaction requires
the blocked amino acid, dimethylformamide (DMF) and
hydroxy-benzotriazole (HOBt) as is well known for
peptide synthesis. The resulting product contains an
added amino acid residue with a FMOC-blocked amino
terminus, ready for deblocking and addition of a
subsequent blocked amino acid as before.

For synthesis of a polypeptide on the linker
substrate in the direction of amino to carboxy
terminus, a free carboxy terminus on the linker is
required that can be conveniently blocked and
deblocked as needed. A preferred carboxy terminus
blocking/activating group is the Opfp ester described
before. A carboxy terminus on the linker is produced
by reacting a linker with a free amino terminus with
succinamide in HOBT and a proton catalyst.
Thereafter, the terminus can be modified by reaction
with pentafluorophenol in dicyclohexylcarbodiimide
(DCC) and ethyl acetate to form an Opfp ester at the
free carboxy terminus. The Opfp ester-blocked
linker terminus is available for addition reaction
with a FMOC-, Opfp-blocked amino acid as before, but
with the amino acid adding to the linker in the
reverse direction. The resulting product contains an
added amino acid residue with an Opfp-blocked
terminus, ready to repeat the addition with a
subsequent blocked amino acid.



b. Oligonucleotide Synthesis

Oligonucleotides can be
synthesized by a variety of chemistries as is well
known. An excellent review is "Oligonucleotide
Synthesis: A Practical Approach", ed. M.J. Gait, IRL
Press, Oxford, (1984). Preferred oligonucleotide
synthesis can be carried out on any of a variety of
automated DNA synthesizing machines such as are
available from Applied Biosystems Inc., (ABI; Foster
City, CA). Additional excellent references for
oligonucleotide synthesis and DNA synthesis machines
and technology include Caruthers, Science, 230:281-285
(1985); and Caruthers, Acc. Chem. Res., 24:278-284
(1991).
For synthesis of an oligonucleotide on the linker
substrate in the direction of 3' to 5', a free hydroxy
terminus on the linker is required that can be
conveniently blocked and deblocked as needed. A
preferred hydroxy terminus blocking group is a
dimethoxytrityl ether (DMT). DMT blocked termini are
first deblocked, such as by treatment with 3%
dichloroacetic acid in dichloromethane (DCM) as is
well known for oligonucleotide synthesis, to form a
free hydroxy terminus.

Nucleotides in precursor form for addition to a
free hydroxy terminus in the direction of 3' to 5'
require a phosphoramidite moiety having a diisopropyl
phosphoramidite at the 3' terminus of a nucleotide.
In addition, the oxygen of the phosphoramidite is
blocked with a cyanoethyl group (CNE), and the 5'
terminus is blocked with a DMT ether.

The addition of a 5' DMT-, 3' CNE-blocked
nucleoside phosphoramidite to a free hydroxyl requires
tetrazole in acetonitrile followed by (iodine)
oxidation and capping of unreacted hydroxyls with
acetic anhydride, as is well known for oligonucleotide
synthesis. The resulting product contains an added
nucleotide residue with a DMT blocked 5' terminus,
ready for deblocking and addition of a subsequent
blocked nucleotide as before.

For synthesis of an oligonucleotide on the linker
in the direction of 5' to 3', a free hydroxy terminus
on the linker is required as before. However, the
blocked nucleotide to be added has the blocking
chemistries reversed on its 5' and 3' termini to
facilitate addition in the opposite orientation.

A nucleotide with a free 3' hydroxyl and 5' DMT
ether is first blocked at the 3' hydroxy terminus by
reaction with TBS-Cl in imidazole to form a TBS ester
at the 3' terminus. Then the DMT-blocked 5' terminus
is deblocked with DCA in DCM as before to form a free
5' hydroxy terminus. The reagent
(N,N-diisopropylamino)(cyanoethyl)phosphonamidic
chloride is reacted in tetrahydrofuran (THF) with the
5' deblocked nucleotide to form the aminodiisopropyl-,
CNE-blocked phosphoramidite group on the 5' terminus.
Thereafter the 3' TBS ester is removed with
tetrabutylammonium fluoride (TBAF) in DCM to form a
nucleotide with the phosphoramidite-blocked 5'
terminus and a free 3' hydroxy terminus. Reaction in
base with DMT-Cl adds a DMT ether blocking group to
the 3' hydroxy terminus.

The addition of the 3' DMT-, 5' CNE-blocked
phosphoramidited nucleotide to a linker substrate
having a free hydroxy terminus then proceeds using the
previous tetrazole catalyzed reaction, as is well
known for oligonucleotide synthesis. The resulting
product contains an added nucleotide residue with a
DMT-blocked 3' terminus, ready for deblocking with DCA
in DCM and the addition of a subsequent blocked
nucleotide as before.

The above demonstrates that the present
bifunctional molecules can be synthesized having
polypeptide (X)a in either orientation and having the
polymer identifier oligonucleotide (Z)a in either
orientation. Exemplary is the synthesis described
herein in detail to form a library of bifunctional
molecules having the oligonucleotide attached to
linker through its 3' terminus and having the peptide
attached to linker through its carboxy terminus.

In one preferred embodiment, the order of
synthesis orients the polypeptide on the linker such
that after addition an added amino acid has a free
amino terminus, that is, the polymer is assembled in
the direction from carboxy to amino terminus.
Exemplary chemistry for this synthesis is described in
the Examples.

The addition of oligonucleotides rather than
single nucleotides to the growing polymer identifier
nucleotide sequence is an alternate embodiment that is
preferred because it affords more rapid and modular
assembly of the library. Although the previous
synthesis discussions involved single nucleotide base
units, the same blocking groups and addition
chemistries apply where an oligonucleotide is to be
added.

An oligonucleotide having 5' CNE-blocked and
3' DMT-blocked termini, or having 3' CNE-blocked and
5' DMT-blocked termini, can readily be prepared using
the oligonucleotide synthesis methods presently
available and described herein.

After synthesis of a bifunctional molecule, or
library of molecules, the blocking groups at termini
and at amino acid side chains are removed. Because of
the relative lability of termini, it is preferred that
the order of deblocking be selected so as to preserve
the functionalities, particularly the side chain
functionalities.

In the present preferred embodiment for a
polypeptide library, the following sequence of
deprotecting is preferred when using the teflon
support and 5' BCM3 linker described in Examples 2-3:

1) tetrabutyl ammonium fluoride (TBAF)
treatment to remove TBS and TMS ethyl ethers;

2) brief 5 minute treatment with
trifluoroacetic acid (TFA) sufficient to
remove MTr, Bum, PMC and Trt groups, followed by
neutralization for 5 minutes with triethylamine;

3) aqueous ammonia treatments to remove Bz and
CNE groups; and

4) cleavage of the bifunctional molecule from
the solid support using a periodate oxidation.

Alternatively, the following sequence of
deprotection is used on the CPG linker described in
Example 3 either with or without the photoactively
cleavable linker moiety:

1) TBAF treatment to remove TBS and TMS ethyl
ethers; and

2) aqueous ammonia treatment to remove Bz and
CNE groups.
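The two deprotection sequences above can be written
down as ordered reagent tables, which makes it easy to
see which step removes a given blocking group. The
sketch below records only the reagent-to-group
assignments stated in the lists above; the final
periodate cleavage step is omitted because it removes
no blocking group. The data structure and function
names are hypothetical.

```python
# Ordered (reagent, groups removed) steps, as listed in the text.
TEFLON_SEQUENCE = [
    ("TBAF", ("TBS", "TMS ethyl")),
    ("TFA", ("MTr", "Bum", "PMC", "Trt")),
    ("aqueous ammonia", ("Bz", "CNE")),
]
CPG_SEQUENCE = [
    ("TBAF", ("TBS", "TMS ethyl")),
    ("aqueous ammonia", ("Bz", "CNE")),
]


def deprotection_plan(sequence, groups_present):
    """Return (steps, leftovers): the numbered steps that act on the
    groups present, and any groups no step in the sequence removes."""
    present = set(groups_present)
    steps = []
    for number, (reagent, removable) in enumerate(sequence, start=1):
        removed = sorted(present & set(removable))
        if removed:
            steps.append((number, reagent, removed))
            present -= set(removed)
    return steps, sorted(present)


steps, leftovers = deprotection_plan(TEFLON_SEQUENCE, ["Trt", "TBS", "CNE"])
for number, reagent, removed in steps:
    print(f"step {number}: {reagent} removes {', '.join(removed)}")
print("not removed:", leftovers)
```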

As indicated, after the library has been
synthesized, and after the protecting groups have been
removed, the bifunctional molecules may be cleaved off
of the solid support, and the released bifunctional
molecules separated from the solid phase to form a
solution comprising a plurality of bifunctional
molecules. Alternatively, the library may be
maintained in the form of a plurality of bifunctional
molecules in the solid phase.

Although natural amino acids are used in the
Examples, the present invention is not to be so
limited. The alphabet of possible amino acid residues
can be extended to include any molecule that satisfies
the basic chemistry defining an amino acid, namely
carboxyl and amino termini. Upon polymerization, an
amide bond is formed. Thus the possible amino acids
can include L-amino acids, D-amino acids, natural
amino acids, non-natural amino acids, and derivatives
thereof, including pharmaceutically active molecules.

In addition, there is no basis to limit the
polypeptide backbone connecting the termini to the
conventional amino acid structure. The amino and
carboxylic acid moieties can be on any backbone having
any side group substituents, so long as the side
groups are properly blocked as described herein.
Previously undescribed amino acids may be developed
that can be used in the present invention, having
heteroatomic configurations, including unusual
heterocyclic rings, such as thiazole-alanine or purine
alanine.

The development and use of both conventional and
unusual amino acid structures provide a greater
diversity of chemical moieties for a library of this
invention. Such libraries allow the exploration by
the screening methods of this invention of new
combinations of important core chemical structures.
Typical backbones can be alkyl chains of (CH2)n
where n can be from 1 to at least 6. In addition, the
alphabet can comprise amino acids of varying backbone
structures such as α, β or γ amino acids. Alphabets
can also comprise amino acids where the number of
carbon atoms and their configuration in the backbone
can be varied.



C. Methods for Identifying Chemical Structures
The library of this invention provides a
repertoire of chemical diversity such that each
chemical moiety is linked to a genetic tag that
facilitates identification of the chemical structure.

By the present screening methods, one can
identify optimized chemical structures that
participate in binding interactions or chemical
catalysis events with a biologically active molecule
by drawing upon a repertoire of structures randomly
formed by the combinatorial association of diverse
chemical units without the necessity of either
synthesizing them one at a time or knowing their
interactions in advance.

The invention therefore also contemplates a
method for identifying a chemical structure that
participates in a preselected binding or catalysis
interaction between the chemical structure and a
biologically active molecule. The chemical structure
to be identified is represented by one of the members
of a library of this invention, and the method
comprises the following steps:

(1) A library according to the present invention
is admixed with a preselected biologically active
molecule under binding conditions (i.e., a binding
reaction admixture) for a time period sufficient for
the biologically active molecule to interact with at
least one bifunctional molecule of this invention
present in the library and form a binding reaction
complex.

(2) The binding reaction complex is then
isolated from the library admixture to form an
isolated complex.

(3) The nucleotide sequence of the polymer
identifier oligonucleotide present in the isolated
binding reaction complex is determined. The
nucleotide sequence provides a code that defines the
chemical structure that participated in the binding
reaction, and thus determining that sequence
identifies the chemical structure that participates in
the binding reaction with the biologically active
molecule.
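
As a purely illustrative sketch of step (3), the decoding of a
determined identifier sequence can be expressed as a table
lookup. The Python sketch below assumes, hypothetically, that
each chemical unit is encoded by a six-nucleotide word, that the
coding region has already been separated from any flanking
primer sites, and that the words occur in the same order as the
chemical units. The two codebook entries and the names CODEBOOK,
UNIT_LEN and decode_identifier are placeholders chosen for
illustration, not the codes of any particular library.

```python
from typing import List

# Hypothetical codebook: one fixed-length nucleotide word per chemical unit.
CODEBOOK = {
    "CACATG": "Gly",
    "ACGGTA": "Met",
}
UNIT_LEN = 6  # assumed word length, in nucleotides


def decode_identifier(coding_region: str) -> List[str]:
    """Translate the coding region of an identifier into chemical units."""
    seq = coding_region.upper()
    if len(seq) % UNIT_LEN != 0:
        raise ValueError("coding region is not a whole number of unit words")
    units = []
    # Walk the sequence one word at a time and look each word up.
    for start in range(0, len(seq), UNIT_LEN):
        word = seq[start:start + UNIT_LEN]
        if word not in CODEBOOK:
            raise ValueError(f"unknown unit word {word!r} at position {start}")
        units.append(CODEBOOK[word])
    return units
```

Under these assumptions, decode_identifier("CACATGACGGTACACATG")
yields the unit order Gly, Met, Gly. A word that is absent from
the codebook raises an error rather than being silently skipped,
since a sequencing error would otherwise misidentify the
chemical structure.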

Where catalysis is the activity to be identified,
a reactant can be identified indicating the occurrence
of a catalytic event, and the causative solid support
in the library is selected as the candidate catalytic
molecule.

A typical biologically active molecule exhibiting
a preselected binding interaction or catalytic
reaction can be any of a variety of molecules that
bind selectively to and/or react with another
molecule, including antibodies to antigens, lectins to
oligosaccharides, receptors to ligands, enzymes to
substrates and the like mediators of molecular
interactions, and can be catalytic molecules like
proteases. Therefore, a preselected binding
interaction is defined by the selection of the
biologically active molecule with which a library
member is to bind. Similarly, a preselected catalytic
activity is defined by selection of a substance with
which a library member is catalytically active. The
binding reaction will be discussed as exemplary,
although it is understood that the procedures can be
readily adapted for use to detect catalytic polymers.



1. Binding Reaction Admixtures

The admixture of a library of the
invention with a biologically active molecule can be
in the form of a heterogeneous or homogeneous
admixture. Thus, the members of the library can be in
the solid phase with the biologically active molecule
present in the liquid phase. Alternatively, the
biologically active molecule can be in the solid phase
with the members of the library present in the liquid
phase. Still further, both the library members and
the biologically active molecule can be in the liquid
phase.

Binding conditions are those conditions
compatible with the known natural binding function of
the biologically active molecule. Those compatible
conditions are buffer, pH and temperature conditions
that maintain the biological activity of the
biologically active molecule, thereby maintaining the
ability of the molecule to participate in its
preselected binding interaction. Typically, those
conditions include an aqueous, physiologic solution of
pH and ionic strength normally associated with the
biologically active molecule of interest.

For example, where the binding interaction is to
identify a member in the library able to bind an
antibody molecule, the preferred binding conditions
would be conditions suitable for the antibody to
immunoreact with its immunogen, or a known
immunoreacting antigen. For a receptor molecule, the
binding conditions would be those compatible with
measuring receptor-ligand interactions.

A time period sufficient for the admixture to
form a binding reaction complex is typically that
length of time required for the biologically active
molecule to interact with its normal binding partner
under conditions compatible with interaction.
Although the time periods can vary depending on the
molecule, admixing times are typically for at least a
few minutes, and usually not longer than several
hours, although nothing is to preclude using longer
admixing times for a binding reaction complex to form.

A binding reaction complex is a stable product of
the interaction between a biologically active molecule
and a bifunctional molecule of this invention. The
product is referred to as a stable product in that the
interaction is maintained over sufficient time that
the complex can be isolated from the rest of the
members of the library without the complex becoming
significantly disassociated.



2. Isolation of a Bifunctional Molecule
from the Binding Reaction Admixture

A binding reaction complex is isolated
from the binding reaction admixture by any separation
means that is selective for the complex, thereby
isolating that species of bifunctional molecule which
has bound to the biologically active molecule. There
are a variety of separation means, depending on the
status of the biologically active molecule.

For example, the biologically active molecule can
be provided in admixture in the form of a solid phase
reagent, i.e., affixed to a solid support, and thus
can readily be separated from the liquid phase,
thereby removing the majority of species of
bifunctional molecule. Separation of the solid phase
from the binding reaction admixture can optionally be
accompanied by washes of the solid support to rinse
bifunctional molecules having lower binding affinities
off of the solid support.

Alternatively, for a homogeneous liquid binding
reaction admixture, a secondary binding means specific
for the biologically active molecule can be utilized
to bind the molecule and provide for its separation
from the binding reaction admixture.

For example, an immobilized antibody
immunospecific for the biologically active molecule
can be provided as a solid phase-affixed antibody to
the binding reaction admixture after the binding
reaction complex is formed. The immobilized antibody
immunoreacts with the biologically active molecule
present in the binding reaction admixture to form an
antibody-biologically active molecule immunoreaction
complex. Thereafter, by separation of the solid phase
from the binding reaction admixture, the
immunoreaction complex, and therefore any binding
reaction complex, is separated from the admixture to
form isolated bifunctional molecule.

Alternatively, a binding means can be operatively
linked to the biologically active molecule to
facilitate its retrieval from the binding reaction
admixture. Exemplary binding means are one of the
following high affinity pairs: biotin-avidin, protein
A-Fc receptor, ferritin-magnetic beads, and the like.
Thus, the biologically active molecule is operatively
linked (conjugated) to biotin, protein A, ferritin and
the like binding means, and the binding reaction
complex is isolated by the use of the corresponding
binding partner in the solid phase, e.g., solid-phase
avidin, solid-phase Fc receptor, solid-phase magnetic
beads and the like.

The use of solid supports on which to operatively
link proteinaceous molecules is generally well known
in the art. Useful solid support matrices are well
known in the art and include cross-linked dextran such
as that available under the tradename SEPHADEX from
Pharmacia Fine Chemicals (Piscataway, NJ); agarose,
borosilicate, polystyrene or latex beads about 1
micron to about 5 millimeters in diameter, polyvinyl
chloride, polystyrene, cross-linked polyacrylamide,
nitrocellulose or nylon-based webs such as sheets,
strips, paddles, plates, microtiter plate wells and the
like insoluble matrices.



3. Determining the Identifier Sequence

The nucleotide sequence of the
identifier oligonucleotide present in the isolated
bifunctional molecules is determined to identify the
species of chemical moiety that participated in the
preselected binding interaction or catalytic reaction.



Although conventional DNA sequencing methods are
readily available and useful for this determination,
the amount and quality of isolated bifunctional
molecule may require additional manipulations prior to
a sequencing reaction.

Where the amount is low, it is preferred to
increase the amount of the identifier oligonucleotide
by polymerase chain reaction (PCR) using PCR primers
directed to the primers P1 and P2 present in the
identifier oligonucleotide.

In addition, the quality of the isolated
bifunctional molecule may be such that multiple
species of bifunctional molecule are co-isolated by
virtue of similar capacities for binding to the
biologically active molecule. In cases where more
than one species of bifunctional molecule are
isolated, the different isolated species must be
separated prior to sequencing of the identifier
oligonucleotide.

Thus in one embodiment, the different identifier
oligonucleotides of the isolated bifunctional
molecules are cloned into separate sequencing vectors
prior to determining their sequence by DNA sequencing
methods. This is typically accomplished by amplifying
all of the different identifier oligonucleotides by
PCR as described herein, and then using the unique
restriction endonuclease sites on the amplified
product as shown in Figure 1 to directionally clone
the amplified fragments into sequencing vectors. The
cloning and sequencing of the amplified fragments then
is a routine procedure that can be carried out by any
of a number of molecular biological methods known in
the art.

Alternatively, PCR amplified products derived
from a population of isolated bifunctional molecules
can be used as a hybridization probe to selectively
enrich the quality of the isolated bifunctional
molecules. For example, using the hybridization
probes, which are modified by biotinylation as shown
in Figure 1, one can isolate members of the library by
hybridization, to form an enriched library containing
only bifunctional molecules that have sequences that
hybridize to the above hybridization probes. In a
second screening reaction under different binding
conditions, for example, higher stringency binding
conditions, one can isolate the species of
bifunctional molecule that binds most tightly with the
biologically active molecule.

Thus the library can be manipulated to form
enriched libraries from which to screen for chemical
diversity.

4. Polymerase Chain Reaction

For determining the nucleotide sequence
of the identifier oligonucleotide in the isolated
complex as part of the methods of this invention, the
use of the polymerase chain reaction (PCR) is a
preferred embodiment.

For use in this invention, the identifier
oligonucleotides are comprised of polynucleotide coding
strands, such as mRNA and/or the sense strand of
genomic DNA. If the genetic material to be assayed is
in the form of double stranded DNA, it is usually
first denatured, typically by melting, into single
strands. The nucleic acid is subjected to a PCR
reaction by treating (contacting) the sample with a
PCR primer pair, each member of the pair having a
preselected nucleotide sequence. The PCR primer pair
is capable of initiating primer extension reactions by
hybridizing to the PCR primer binding site on
nucleotide sequences of the identifier nucleotide,
preferably at least about 10 nucleotides in length,
and more preferably at least about 20 nucleotides
in length. The first primer of a PCR primer pair is
sometimes referred to herein as the "anti-sense
primer" because it hybridizes to a non-coding or
anti-sense strand of a nucleic acid, i.e., a strand
complementary to a coding strand. The second primer
of a PCR primer pair is sometimes referred to herein
as the "sense primer" because it hybridizes to the
coding or sense strand of a nucleic acid.

The PCR reaction is performed by mixing the PCR
primer pair, preferably a predetermined amount
thereof, with the nucleic acids of the sample,
preferably a predetermined amount thereof, in a PCR
buffer to form a PCR reaction admixture. The
admixture is thermocycled for a number of cycles,
which is typically predetermined, sufficient for the
formation of a PCR reaction product, thereby enriching
the sample to be assayed for the identifier
oligonucleotide in the isolated complex.
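
To illustrate why a modest, predetermined number of cycles can
suffice, the following Python sketch uses the commonly cited
idealized model in which each cycle multiplies the template by
(1 + E), where E is a per-cycle efficiency between 0 and 1. The
model and the function name pcr_copies are assumptions
introduced here for illustration. Real reactions plateau as
primers and nucleotides are consumed, so the figure is best read
as an upper bound.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of thermocycles.

    efficiency = 1.0 means perfect doubling in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect doubling, a single template molecule gives 1024
copies after 10 cycles, and about a million after 20.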

PCR is typically carried out by thermocycling,
i.e., repeatedly increasing and decreasing the
temperature of a PCR reaction admixture within a
temperature range whose lower limit is about 30
degrees Celsius (30°C) to about 55°C and whose upper
limit is about 90°C to about 100°C. The increasing
and decreasing can be continuous, but is preferably
phasic with time periods of relative temperature
stability at each of the temperatures favoring
polynucleotide synthesis, denaturation and
hybridization.

A plurality of first primers and/or a plurality of
second primers can be used in each amplification,
e.g., one species of first primer can be paired with a
number of different second primers to form several
different primer pairs. Alternatively, an individual
pair of first and second primers can be used. In any
case, the amplification products of amplifications
using the same or different combinations of first and
second primers can be combined for assaying for
mutations.

The PCR reaction is performed using any suitable
method. Generally it occurs in a buffered aqueous
solution, i.e., a PCR buffer, preferably at a pH of
7-9, most preferably about 8. Preferably, a molar
excess of the primer is admixed to the buffer
containing the template strand. A large molar excess
is preferred to improve the efficiency of the process.

The PCR buffer also contains the
deoxyribonucleotide triphosphates (polynucleotide
synthesis substrates) dATP, dCTP, dGTP, and dTTP and a
polymerase, typically thermostable, all in adequate
amounts for primer extension (polynucleotide
synthesis) reaction. The resulting solution (PCR
admixture) is heated to about 90°C - 100°C for about 1
to 10 minutes, preferably from 1 to 4 minutes. After
this heating period the solution is allowed to cool to
54°C, which is preferable for primer hybridization.
The synthesis reaction may occur at from room
temperature up to a temperature above which the
polymerase (inducing agent) no longer functions
efficiently. Thus, for example, if DNA polymerase is
used as inducing agent, the temperature is generally
no greater than about 40°C. The thermocycling is
repeated until the desired amount of PCR product is
produced. An exemplary PCR buffer comprises the
following: 50 mM KCl; 10 mM Tris-HCl at pH 8.3; 1.5
mM MgCl2; 0.001% (wt/vol) gelatin; 200 µM dATP; 200 µM
dTTP; 200 µM dCTP; 200 µM dGTP; and 2.5 units Thermus
aquaticus (Taq) DNA polymerase I (U.S. Patent No.
4,889,818) per 100 microliters (µl) of buffer.
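
As a worked illustration of the exemplary buffer, the Python
sketch below scales the stated final concentrations to an
arbitrary reaction volume. It relies only on the arithmetic
identity that a millimolar concentration multiplied by a volume
in microliters gives an amount in nanomoles. The gelatin, which
is specified by weight per volume, is omitted, and the names
EXEMPLARY_BUFFER_MM, amounts_nmol and taq_units are hypothetical
helpers introduced here.

```python
# Final concentrations of the exemplary PCR buffer, in millimolar (mM).
EXEMPLARY_BUFFER_MM = {
    "KCl": 50.0,
    "Tris-HCl (pH 8.3)": 10.0,
    "MgCl2": 1.5,
    "dATP": 0.2,  # 200 micromolar
    "dTTP": 0.2,
    "dCTP": 0.2,
    "dGTP": 0.2,
}
TAQ_UNITS_PER_100_UL = 2.5


def amounts_nmol(volume_ul: float) -> dict:
    """Amount of each solute in nanomoles for a reaction of volume_ul.

    mM multiplied by microliters gives nanomoles directly.
    """
    if volume_ul <= 0:
        raise ValueError("reaction volume must be positive")
    return {name: mm * volume_ul for name, mm in EXEMPLARY_BUFFER_MM.items()}


def taq_units(volume_ul: float) -> float:
    """Units of Taq polymerase, scaled from 2.5 units per 100 microliters."""
    if volume_ul <= 0:
        raise ValueError("reaction volume must be positive")
    return TAQ_UNITS_PER_100_UL * volume_ul / 100.0
```

For a 50 microliter reaction this gives 2500 nanomoles of KCl
and 1.25 units of polymerase.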

The inducing agent may be any compound or system
which will function to accomplish the synthesis of
primer extension products, including enzymes. Suitable
enzymes for this purpose include, for example, E. coli
DNA polymerase I, Klenow fragment of E. coli DNA
polymerase I, T4 DNA polymerase, other available DNA
polymerases, reverse transcriptase, and other enzymes,
including heat-stable enzymes, which will facilitate
combination of the nucleotides in the proper manner to
form the primer extension products which are
complementary to each nucleic acid strand. Generally,
the synthesis will be initiated at the 3' end of each
primer and proceed in the 5' direction along the
template strand, until synthesis terminates, producing
molecules of different lengths. There may be inducing
agents, however, which initiate synthesis at the 5'
end and proceed in the above direction, using the same
process as described above.

The inducing agent also may be a compound or
system which will function to accomplish the synthesis
of RNA primer extension products, including enzymes.
In preferred embodiments, the inducing agent may be a
DNA-dependent RNA polymerase such as T7 RNA
polymerase, T3 RNA polymerase or SP6 RNA polymerase.
These polymerases produce a complementary RNA
polynucleotide. The high turn-over rate of the RNA
polymerase amplifies the starting polynucleotide as
has been described by Chamberlin et al., The Enzymes,
ed. P. Boyer, pp. 87-108, Academic Press, New York
(1982). Amplification systems based on transcription
have been described by Gingeras et al., in PCR
Protocols, A Guide to Methods and Applications, pp.
245-252, Innis et al., eds, Academic Press, Inc., San
Diego, CA (1990).

If the inducing agent is a DNA-dependent RNA
polymerase and therefore incorporates ribonucleotide
triphosphates, sufficient amounts of ATP, CTP, GTP and
UTP are admixed to the primer extension reaction
admixture and the resulting solution is treated as
described above.

The newly synthesized strand and its
complementary nucleic acid strand form a
double-stranded molecule which can be used in the
succeeding steps of the process.

PCR amplification methods are described in detail
in U.S. Patent Nos. 4,683,192, 4,683,202, 4,800,159,
and 4,965,188, and at least in several texts including
PCR Technology: Principles and Applications for DNA
Amplification, H. Erlich, ed., Stockton Press, New
York (1989); and PCR Protocols: A Guide to Methods
and Applications, Innis et al., eds., Academic Press,
San Diego, California (1990).

The term "polynucleotide" as used herein in
reference to primers, probes and nucleic acid
fragments or segments to be synthesized by primer
extension is defined as a molecule comprised of two or
more deoxyribonucleotides or ribonucleotides,
preferably more than three. Its exact size will
depend on many factors, which in turn depend on the
ultimate conditions of use.

The term "primer" as used herein refers to a
polynucleotide, whether purified from a nucleic acid
restriction digest or produced synthetically, which is
capable of acting as a point of initiation of nucleic
acid synthesis when placed under conditions in which
synthesis of a primer extension product which is
complementary to a nucleic acid strand is induced,
i.e., in the presence of nucleotides and an agent for
polymerization such as DNA polymerase, reverse
transcriptase and the like, and at a suitable
temperature and pH. The primer is preferably single
stranded for maximum efficiency, but may alternatively
be in double stranded form. If double stranded, the
primer is first treated to separate it from its
complementary strand before being used to prepare
extension products. Preferably, the primer is a
polydeoxyribonucleotide. The primer must be
sufficiently long to prime the synthesis of extension
products in the presence of the agents for
polymerization. The exact lengths of the primers will
depend on many factors, including temperature and the
source of primer. For example, depending on the
complexity of the target sequence, a polynucleotide
primer typically contains 15 to 25 or more
nucleotides, although it can contain fewer
nucleotides. Short primer molecules generally require
cooler temperatures to form sufficiently stable hybrid
complexes with template.
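
The dependence of hybrid stability on primer length can be
illustrated with a widely used rule of thumb, often called the
Wallace rule, which estimates the melting temperature of a short
oligonucleotide as 2°C for each A or T plus 4°C for each G or C.
The rule is not part of the methods described herein, is only
approximate, and ignores salt and primer concentration. The
function name wallace_tm in the Python sketch below is a
hypothetical helper.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (degrees C) of a short DNA primer.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C).
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

By this estimate a 10-mer of even base composition melts near
30°C while the corresponding 20-mer melts near 60°C, consistent
with shorter primers requiring cooler temperatures.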

The primers used herein are selected to be
"substantially" complementary to the different strands
of each specific sequence to be synthesized or
amplified. This means that the primer must be
sufficiently complementary to non-randomly hybridize
with its respective template strand. Therefore, the
primer sequence may or may not reflect the exact
sequence of the template. For example, a
non-complementary nucleotide fragment can be attached
to the 5' end of the primer, with the remainder of the
primer sequence being substantially complementary to
the strand. Such non-complementary fragments
typically code for an endonuclease restriction site.
Alternatively, non-complementary bases or longer
sequences can be interspersed into the primer,
provided the primer sequence has sufficient
complementarity with the sequence of the strand to be
synthesized or amplified to non-randomly hybridize
therewith and thereby form an extension product under
polynucleotide synthesizing conditions.
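
The construction of a primer carrying a non-complementary
5' fragment can be sketched as follows. The Python example
assumes a template strand written 5' to 3', takes the reverse
complement of its 3'-terminal bases as the priming region, and
prepends a tail. The tail used in the usage note, GAATTC, is the
well known EcoRI recognition sequence and is chosen only as an
example. The names reverse_complement and tailed_primer are
hypothetical helpers.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primer(template: str, priming_len: int, tail: str) -> str:
    """Primer (5' to 3') that anneals to the 3' end of `template`.

    The 3' portion is complementary to the template; the 5' `tail`
    is non-complementary and does not take part in hybridization.
    """
    if not 0 < priming_len <= len(template):
        raise ValueError("priming_len must be between 1 and len(template)")
    priming_region = reverse_complement(template[-priming_len:])
    return tail.upper() + priming_region
```

For the template ATGCCGTA with a four-base priming region,
tailed_primer("ATGCCGTA", 4, "GAATTC") returns GAATTCTACG: the
3'-terminal TACG pairs with the template, while the GAATTC tail
becomes part of the amplified product only after the first
round of extension and copying.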

Primers of the present invention may also contain
a DNA-dependent RNA polymerase promoter sequence or
its complement. See for example, Krieg et al., Nucl.
Acids Res., 12:7057-70 (1984); Studier et al., J. Mol.
Biol., 189:113-130 (1986); and Molecular Cloning: A
Laboratory Manual, Second Edition, Maniatis et al.,
eds., Cold Spring Harbor, NY (1989).

When a primer containing a DNA-dependent RNA
polymerase promoter is used, the primer is hybridized
to the polynucleotide strand to be amplified and the
second polynucleotide strand of the DNA-dependent RNA
polymerase promoter is completed using an inducing
agent such as E. coli DNA polymerase I, or the Klenow
fragment of E. coli DNA polymerase. The starting
polynucleotide is amplified by alternating between the
production of an RNA polynucleotide and DNA
polynucleotide.

Primers may also contain a template sequence or
replication initiation site for an RNA-directed RNA
polymerase. Typical RNA-directed RNA polymerases
include the Qβ replicase described by Lizardi et al.,
Biotechnology, 6:1197-1202 (1988). RNA-directed
polymerases produce large numbers of RNA strands from
a small number of template RNA strands that contain a
template sequence or replication initiation site.
These polymerases typically give a one million-fold
amplification of the template strand as has been
described by Kramer et al., J. Mol. Biol., 89:719-736
(1974).

The polynucleotide primers can be prepared using
any suitable method, such as, for example, the
phosphotriester or phosphodiester methods; see Narang
et al., Meth. Enzymol., 68:90 (1979); U.S. Patents
No. 4,356,270, No. 4,458,066, No. 4,416,988, No.
4,293,652; and Brown et al., Meth. Enzymol., 68:109
(1979).

If the nucleic acid sample is to be enriched for
the identifier oligonucleotide in the isolated complex
by PCR amplification, two primers, i.e., a PCR primer
pair, must be used for each coding strand of nucleic
acid to be amplified. The first primer becomes part
of the non-coding (anti-sense or minus or
complementary) strand and hybridizes to a nucleotide
sequence on the plus or coding strand. Second primers
become part of the coding (sense or plus) strand and
hybridize to a nucleotide sequence on the minus or
non-coding strand. One or both of the first and
second primers can contain a nucleotide sequence
defining an endonuclease recognition site as described
herein. The site can be heterologous to the polymer
identifier oligonucleotide being amplified.

In one embodiment, the present invention utilizes
a set of polynucleotides that form primers having a
priming region located at the 3'-terminus of the
primer. The priming region is typically the 3'-most
(3'-terminal) 15 to 30 nucleotide bases. The
3'-terminal priming portion of each primer is capable
of acting as a primer to catalyze nucleic acid
synthesis, i.e., initiate a primer extension reaction
off its 3' terminus. One or both of the primers can
additionally contain a 5'-terminal (5'-most)
non-priming portion, i.e., a region that does not
participate in hybridization to the preferred
template.

5. Nucleic Acid Sequence Analysis

Nucleic acid sequence analysis is a
well known procedure for determining the sequence of
nucleotides and is applied to the present methods to
determine the nucleotide sequence in an identifier
oligonucleotide or PCR amplification product of this
invention. Nucleic acid sequence analysis is
approached by a combination of (a) physicochemical
techniques, based on the hybridization or denaturation
of a probe strand plus its complementary target, and
(b) enzymatic reactions with endonucleases, ligases,
and polymerases.

In assays using nucleic acid hybridization,
detecting the presence of a DNA duplex in a process of
the present invention can be accomplished by a variety
of means.

In one approach for detecting the presence of a
DNA duplex, an oligonucleotide that is hybridized in
the DNA duplex includes a label or indicating group
that will render the duplex detectable. Typically
such labels include radioactive atoms, chemically
modified nucleotide bases, and the like.

The oligonucleotide can be labeled, i.e.,
operatively linked to an indicating means or group,
and used to detect the presence of a specific
nucleotide sequence in a target template.

Radioactive elements operatively linked to or
present as part of an oligonucleotide probe (labeled
oligonucleotide) provide a useful means to facilitate
the detection of a DNA duplex. A typical radioactive
element is one that produces beta ray emissions.
Elements that emit beta rays, such as 3H, 14C, 32P and
35S represent a class of beta ray emission-producing
radioactive element labels. A radioactive
polynucleotide probe is typically prepared by
enzymatic incorporation of radioactively labeled
nucleotides into a nucleic acid using DNA kinase.

Alternatives to radioactively labeled
oligonucleotides are oligonucleotides that are
chemically modified to contain metal complexing
agents, biotin-containing groups, fluorescent
compounds, and the like.

One useful metal complexing agent is a lanthanide
chelate formed by a lanthanide and an aromatic
beta-diketone, the lanthanide being bound to the
nucleic acid or oligonucleotide via a chelate-forming
compound such as an EDTA-analogue so that a
fluorescent lanthanide complex is formed. See U.S.
Patent No. 4,374,120, No. 4,569,790 and published
Patent Application Nos. EP0139675 and WO87/0270S.

Biotin or acridine ester-labeled oligonucleotides
and their use to label polynucleotides have been
described. See U.S. Patent No. 4,707,404, published
Patent Application EP0212951 and European Patent No.
0087636. Useful fluorescent marker compounds include
fluorescein, rhodamine, Texas Red, NBD and the like.

A labeled oligonucleotide present in a DNA duplex
renders the duplex itself labeled and therefore
distinguishable over other nucleic acids present in a
sample to be assayed. Detecting the presence of the
label in the duplex, and thereby the presence of the
duplex, typically involves separating the DNA duplex
from any labeled oligonucleotide probe that is not
hybridized to a DNA duplex.

Techniques for the separation of single stranded
oligonucleotide, such as non-hybridized labeled
oligonucleotide probe, from DNA duplex are well known,
and typically involve the separation of single
stranded from double stranded nucleic acids on the
basis of their chemical properties. More often
separation techniques involve the use of a
heterogeneous hybridization format in which the
non-hybridized probe is separated, typically by
washing, from the DNA duplex that is bound to an
insoluble matrix. Exemplary is the Southern blot
technique, in which the matrix is a nitrocellulose
sheet and the label is 32P. Southern, J. Mol. Biol.,
98:503 (1975).

The oligonucleotides can also be advantageously
linked, typically at or near their 5'-terminus, to a
solid matrix, i.e., aqueous insoluble solid support as
previously described.

It is also possible to add "linking" nucleotides
to the 5' or 3' end of the member oligonucleotide, and
use the linking oligonucleotide to operatively link
the member to the solid support.

In nucleotide hybridizing assays, the
hybridization reaction mixture is maintained in the
contemplated method under hybridizing conditions for a
time period sufficient for the oligonucleotides having
complementarity to the predetermined sequence on the
template to hybridize to complementary nucleic acid
sequences present in the template to form a
hybridization product, i.e., a complex containing
oligonucleotide and target nucleic acid.

The phrase "hybridizing conditions" and its
grammatical equivalents, when used with a maintenance
time period, indicates subjecting the hybridization
reaction admixture, in the context of the
concentrations of reactants and accompanying reagents
in the admixture, to time, temperature and pH
conditions sufficient to allow one or more
oligonucleotides to anneal with the target sequence,
to form a nucleic acid duplex. Such time, temperature
and pH conditions required to accomplish hybridization
depend, as is well known in the art, on the length of
the oligonucleotide to be hybridized, the degree of
complementarity between the oligonucleotide and the
target, the guanine and cytosine content of the
oligonucleotide, the stringency of hybridization
desired, and the presence of salts or additional
reagents in the hybridization reaction admixture as
may affect the kinetics of hybridization. Methods for
optimizing hybridization conditions for a given
hybridization reaction admixture are well known in the
art.
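The dependence on oligonucleotide length and guanine and cytosine
content noted above can be illustrated with a rough calculation.
The sketch below uses the Wallace rule of thumb (2 degrees per A
or T, 4 degrees per G or C), which is an assumption brought in for
illustration and is not part of the present description; it is
only a coarse estimate for short oligonucleotides, and the
function name is hypothetical.

```python
def wallace_tm(sequence: str) -> int:
    """Rough melting temperature estimate (deg C) for a short DNA oligo.

    Wallace rule of thumb: 2 degrees per A/T and 4 degrees per G/C.
    This is an illustrative approximation, not a method from the text.
    """
    seq = sequence.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


# A GC-rich oligo of the same length gets a higher estimate than an AT-rich one.
at_rich = wallace_tm("ATATATAT")
gc_rich = wallace_tm("GCGCGCGC")
```

Longer or more GC-rich oligonucleotides give higher estimates,
which is consistent with the qualitative statement that these
properties affect the conditions needed for hybridization.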

Typical hybridizing conditions include the use of
solutions buffered to pH values between 4 and 9, and
are carried out at temperatures from 4°C to 37°C,
preferably about 12°C to about 30°C, more preferably
about 22°C, and for time periods from 0.5 seconds to
24 hours, preferably 2 minutes (min) to 1 hour.
Exemplary are the conditions described in Example 4.
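The ranges recited above can be written as a simple check. The
sketch below is illustrative only: the class and function names
are hypothetical, and the only values taken from the text are the
pH, temperature and time limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationConditions:
    ph: float
    temperature_c: float
    time_s: float


def assess(conditions: HybridizationConditions) -> str:
    """Classify conditions against the typical and preferred ranges."""
    typical = (
        4 <= conditions.ph <= 9
        and 4 <= conditions.temperature_c <= 37
        and 0.5 <= conditions.time_s <= 24 * 3600
    )
    if not typical:
        return "outside typical"
    # Preferred: about 12 to 30 deg C, for 2 minutes to 1 hour.
    preferred = (
        12 <= conditions.temperature_c <= 30
        and 2 * 60 <= conditions.time_s <= 3600
    )
    return "preferred" if preferred else "typical"


room_temperature_run = assess(HybridizationConditions(7.0, 22.0, 600.0))
```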

Hybridization can be carried out in a homogeneous
or heterogeneous format as is well known. The
homogeneous hybridization reaction occurs entirely in
solution, in which both the oligonucleotide and the
nucleic acid sequences to be hybridized (target) are
present in soluble forms in solution. A heterogeneous
reaction involves the use of a matrix that is
insoluble in the reaction medium to which either the
oligonucleotide, polynucleotide probe or target
nucleic acid is bound.

Where the nucleic acid containing a target
sequence is in a double stranded (ds) form, it is
preferred to first denature the dsDNA, as by heating
or alkali treatment, prior to conducting the
hybridization reaction. The denaturation of the dsDNA
can be carried out prior to admixture with an
oligonucleotide to be hybridized, or can be carried
out after the admixture of the dsDNA with the
oligonucleotide.

Effective amounts of the oligonucleotide present
in the hybridization reaction admixture are generally
well known and are typically expressed in terms of
molar ratios between the oligonucleotide to be
hybridized and the template. Preferred ratios are
hybridization reaction mixtures containing equimolar
amounts of the target sequence and the
oligonucleotide. As is well known, deviations from
equal molarity will produce hybridization reaction
products, although at lower efficiency. Thus,
although ratios where one component is in as much
as 100 fold molar excess relative to the other
component can be used, excesses of less than 50 fold,
preferably less than 10 fold, and more preferably less
than two fold are desirable in practicing the invention.
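The molar excess thresholds above can be expressed as a short
calculation. The following is a minimal sketch with hypothetical
function names and labels; only the 2, 10, 50 and 100 fold
thresholds come from the text.

```python
def fold_excess(oligo_mol: float, target_mol: float) -> float:
    """Molar excess of the more abundant component over the other."""
    if oligo_mol <= 0 or target_mol <= 0:
        raise ValueError("amounts must be positive")
    return max(oligo_mol, target_mol) / min(oligo_mol, target_mol)


def classify_ratio(oligo_mol: float, target_mol: float) -> str:
    """Map a molar ratio onto the preference tiers recited in the text."""
    fold = fold_excess(oligo_mol, target_mol)
    if fold < 2:
        return "more preferred"
    if fold < 10:
        return "preferred"
    if fold < 50:
        return "desirable"
    if fold <= 100:
        return "usable"
    return "outside stated range"


equimolar = classify_ratio(1.0, 1.0)  # equal molarity falls in the top tier
```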

Examples

The following examples are intended to
illustrate, but not limit, the present invention.

1. Preparation of Protected Amino Acids

The synthesis of a bifunctional molecule requires
protected amino acids. The amino-terminus of the
amino acid is protected with fluorenylmethoxycarbonyl
(FMOC) and the carboxy-terminus is protected with a
pentafluorophenyl ester (Opfp). The amino acids
lysine, cysteine, tyrosine, serine, threonine,
arginine, histidine, tryptophan, aspartic acid, and
glutamic acid require additional protection of their
side chains (R groups).
Most of the FMOC and Opfp protected amino acids
are commercially available and were obtained from
Bachem Biosciences, Inc. (Philadelphia, PA). The
terminology used herein for their structure is
indicated by the following example for glycine (Gly):
FMOC-Gly-Opfp, where FMOC and Opfp are the amino and
carboxy terminal protecting groups. For side chain
protection, the following protected amino acids are
available from Bachem: FMOC-Arg(MTr)-Opfp having the
substituent N^g-4-methoxy-2,3,6-trimethylbenzenesulfonyl
arginine (MTr) at the side chain amino
terminus of arginine; FMOC-His(Bum)-Opfp having the
substituent N(pi)-tert-butoxymethylhistidine (Bum) at
the heterocyclic reactive nitrogen in histidine;
FMOC-Cys(Trt)-Opfp having the substituent S-triphenyl
methyl cysteine at the side chain sulfur of cysteine;
FMOC-Trp(N-For)-Opfp having a formyl group at the
amino group of tryptophan's heterocyclic group; and
FMOC-Lys(N-Bz)-Opfp having a benzyl group on the free
amino group of lysine's side chain;
where the structure in parentheses indicates the
protecting group on the reactive side chain.

FMOC-Tyr(OTBS)-Opfp having a
tert-butyldimethylsilyl (TBS) ether on the side chain
hydroxy of tyrosine is prepared by reacting an excess
of formic acid with FMOC-Tyr(tert-butyl)-Opfp (Bachem)
to remove the tert-butyl group from the protected
hydroxyl group to form FMOC-Tyr-Opfp. Thereafter, one
equivalent of FMOC-Tyr-Opfp is reacted with 1.2
equivalents of TBS-Cl and 1.5 equivalents of imidazole
in DCM at room temperature for 12 hours under inert
atmosphere to form FMOC-Tyr(OTBS)-Opfp.

FMOC-Ser(OTBS)-Opfp is similarly prepared using
FMOC-Ser(tert-butyl)-Opfp (Bachem) in the reaction.
FMOC-Thr(OTBS)-Opfp is also prepared in this manner
using FMOC-Thr(tert-butyl)-Opfp (Bachem).

FMOC-Asp(TMSE)-Opfp having a trimethylsilyl ethyl
ester (TMSE) on the side chain carboxyl group of
aspartic acid is prepared by first reacting one
equivalent of FMOC-Asp-O-tertbutyl (Bachem) with 1.5
equivalents of 2-trimethylsilylethanol and 1.5
equivalents of dicyclohexylcarbodiimide (DCC) in ethyl
acetate for 12 hours at room temperature under inert
atmosphere to form FMOC-Asp(OTMSE)-O-tertbutyl.
Thereafter the TMSE ester is reacted with an excess of
formic acid at room temperature for 14 hours to
hydrolyze the tertbutyl moiety and form a free
carboxyl terminus in the form of FMOC-Asp(OTMSE)-COOH.
The formic acid is evaporated, and one equivalent of
the remaining amino acid is admixed with 1.1
equivalents of pentafluorophenol (pfp; Bachem)
and 1.1 equivalents of DCC for 12 hours at room
temperature under inert atmosphere to form the product
FMOC-Asp(TMSE)-Opfp. The product is isolated from
unreacted pfp, DCC and precursor amino acid by silica
gel chromatography using 10% (v/v) ethyl acetate in
hexane.

FMOC-Glu(TMSE)-Opfp having a TMSE ester on the
side chain carboxyl group of glutamic acid is prepared
as described above to prepare FMOC-, TMSE- and pfp
protected aspartic acid, except that FMOC-Glu-O-tert-butyl
(Bachem) is used in place of the aspartic acid
precursor, to form FMOC-Glu(TMSE)-Opfp.

2. Preparation of Solid Support
a. Teflon Support

A solid support designated N6-(5'-O-
dimethoxytrityl-2',3'-diacetyl 1-adenylyl)-teflon
support was obtained from Glen Research (Herndon, VA).
The solid support is a teflon resin with a modified
adenine nucleoside having the solid support linkage
through the 6-amino group of a purine base, a
dimethoxytrityl ether (DMT) at the 5' position of the
ribose ring and acetate esters at the 2' and 3'
positions of the ribose ring. The solid support was
admixed with 5 volumes of 3% (v/v) dichloroacetic acid
in dichloromethane (3% DCA in DCM) and maintained for
10 minutes at room temperature under inert atmosphere
to remove the dimethoxytrityl (DMT) protecting group
to form a free 5' hydroxyl. The resulting deprotected
solid support was washed 3 times with DCM to remove
the excess unreacted DCA. The deprotected washed
teflon solid support is ready for coupling to linker.

3. Coupling of Linker to Solid Support
a. Teflon Support

One equivalent of deprotected washed solid
support (prepared in Example 2) was admixed with 20
equivalents of a linker designated
(1-dimethoxytrityloxy-3-fluorenylmethoxycarbonylamino-
propan-2-yl)-(2-cyanoethyl)-(N,N-diisopropyl)-
phosphoramidite and referred to as 5' Branched-
Modifier C3 (or 5'BMC3; available from Glen Research)
and 20 equivalents of tetrazole (0.45M in
acetonitrile, Glen Research) and was agitated with
stirring for 0.5 to 1 hour at room temperature
under inert atmosphere. The admixture was then washed
with an excess of acetonitrile to remove unreacted
reagents. The washed solid phase material was
retained and admixed with 2 equivalents of iodine in
tetrahydrofuran/water, 9:1, (available from Glen
Research) under inert atmosphere and maintained at
room temperature for 10 minutes to form oxidized
solid-support coupled linker.

One equivalent of oxidized solid support-
coupled linker was then admixed with 20 equivalents of
acetonitrile/acetic anhydride, 88:12, (capping
reagent; Glen Research) for 10 minutes at room
temperature to cap any unreacted free hydroxyls
present on the solid support and form capped teflon-
solid support-coupled linker.



b. Controlled Pore Glass (CPG) Support

A controlled pore glass (CPG) support and
linker that can also be used in the present invention
is particularly preferred. The complete structure of
an exemplary CPG support is shown in Figure 3, and
includes (1) a sarcosine linker moiety connecting the
linker to the CPG support, (2) a succinyl-aminohexanol
linker that is cleavable by aqueous ammonia to release
the polypeptide-linker-oligonucleotide conjugate
(A-B-C) from the solid support, and (3) a serine
branch-monomer having a photolabile cleavage site for
releasing the oligonucleotide from the solid support.
The synthesis is carried out stepwise as described
below.

N-Fmoc-amidohexan-1-ol (1) was first prepared to
form the aminohexanol linker moiety. To that end,
6-amino-1-hexanol (0.75 g, 6.4 mmol) was dissolved in
sat. aq. Na2CO3 (10 ml) and cooled on ice.
9-Fluorenylmethyl chloroformate (Fmoc-Cl, 1.83 g, 7.1
mmol) in THF (25 ml, freshly distilled) was added
slowly under vigorous stirring. The solution was
acidified with 10% (w/v) citric acid, extracted with
ethyl acetate and filtered through Celite 545 before
separation of the phases. The organic phase was dried
(Na2SO4), filtered, and evaporated to yield the crude
product (2.15 g). Pure product was obtained by
recrystallization from ethyl acetate/hexane (100 ml,
8/2, v/v) to form the alcohol 1. The yield of alcohol
1 was 1.6 g.
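The quantities reported for alcohol 1 can be cross-checked with a
short stoichiometry calculation. In the sketch below the molecular
formulas (C6H15NO for 6-amino-1-hexanol, C15H11ClO2 for Fmoc-Cl
and C21H25NO3 for alcohol 1) are assumptions inferred from the
compound names rather than values stated in the text, and the
helper names are hypothetical.

```python
import re

# Standard average atomic weights (g/mol).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.45}


def molar_mass(formula: str) -> float:
    """Molar mass (g/mol) of a simple formula such as 'C6H15NO'."""
    return sum(
        ATOMIC_WEIGHT[element] * int(count or 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


def millimoles(grams: float, formula: str) -> float:
    return 1000.0 * grams / molar_mass(formula)


amine_mmol = millimoles(0.75, "C6H15NO")       # limiting reagent
fmoc_cl_mmol = millimoles(1.83, "C15H11ClO2")  # slight excess
product_mmol = millimoles(1.6, "C21H25NO3")    # recrystallized alcohol 1
percent_yield = 100.0 * product_mmol / amine_mmol
```

Under these assumptions the reagent amounts reproduce the stated
6.4 mmol and 7.1 mmol, and the 1.6 g of recrystallized product
corresponds to a yield of roughly 74% based on the amine.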

N-Fmoc-amidohexan-1-yl succinate (2) was prepared
to form the succinyl-aminohexanol linker moiety. The
alcohol 1 (758 mg, 2.2 mmol) was dissolved in anh.
pyridine (5 ml) and evaporated to dryness in vacuo.
The evaporated oil was redissolved in anh. pyridine (3
ml) and succinic anhydride (229 mg, 2.3 mmol),
4-dimethylaminopyridine (DMAP, 12.5 mg, 0.1 mmol), and
diisopropylethylamine (DIPEA, 370 µl, 278 mg, 2.2
mmol) were added with stirring under an inert
atmosphere (Ar). The reaction mixture was stirred
overnight, evaporated to dryness, redissolved in
dichloromethane (100 ml) and extracted with dilute aq.
hydrochloric acid (1 M, 50 ml). The organic phase was
dried (Na2SO4), filtered, and evaporated to yield the
crude product as an oil designated compound 2, which
could be recrystallized from 2-propanol. The yield of
compound 2 was 735 mg (as white crystals), having a
melting point (Mp.) of 67-68°C (uncorr.); and a mass
spectroscopy (MS) (EI, m/e) of 440 (calcd for C25H29NO6
+ H 440).

O-(4,4'-Dimethoxytrityl)-N-Fmoc-L-Serine (3) was
prepared to form the serine-branch linker moiety
containing the FMOC and O-DMT termini. Fmoc-L-Serine
(4.04 g, 12.2 mmol) was dissolved in anh. pyridine (10
ml) and evaporated to dryness in vacuo. This process
was repeated twice. The evaporated oil was
redissolved in anh. pyridine (12 ml) and
4,4'-dimethoxytrityl chloride (4.20 g, 12.4 mmol) was
added with stirring at room temperature under an inert
atmosphere (Ar). The reaction mixture was stirred
overnight, evaporated to dryness, redissolved in
chloroform (4 x 25 ml) and extracted with sat. aq.
NaHCO3 (50 ml). The aq. phase was back-extracted once
with chloroform (25 ml) and the combined organic
phases were dried (Na2SO4), filtered, and
evaporated to yield the crude product as a brownish
oil. The oil was dissolved in chloroform/ethyl acetate
and triturated once with hexane and then purified by
silica gel column chromatography (150 ml silica) using
CHCl3/MeOH/DIPEA, 94/5/1, as the mobile phase. The
fractions were analyzed by TLC and the appropriate
fractions were pooled and evaporated to give a white
foam designated compound 3. The yield of compound 3
was 4.29 g, and the
MS (EI, m/e) was 630 (calcd for C39H35NO7 + H 630).
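The calculated m/e values quoted for compounds 2 and 3 can be
reproduced from their molecular formulas. The sketch below assumes
integer nominal masses (C 12, H 1, N 14, O 16) and treats the
"+ H" as one added hydrogen; the function names are hypothetical.

```python
import re

# Integer nominal masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}


def nominal_mass(formula: str) -> int:
    """Nominal mass of a simple formula such as 'C25H29NO6'."""
    return sum(
        NOMINAL_MASS[element] * int(count or 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


def protonated_mass(formula: str) -> int:
    """Nominal mass of the molecule plus one hydrogen."""
    return nominal_mass(formula) + NOMINAL_MASS["H"]


compound_2 = protonated_mass("C25H29NO6")
compound_3 = protonated_mass("C39H35NO7")
```

Both values agree with the calculated figures given in the text
(440 for compound 2 and 630 for compound 3).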

Controlled pore glass (CPG) was activated for
coupling to the above prepared linkers. To that end,
CPG (Sigma G5019, aminopropyl-CPG, 4.97 g) was
suspended in trichloroacetic acid in dichloromethane
(3%, 20 ml) and agitated in a shaker for 4 hours. The
CPG was isolated by filtration, washed three times
with dichloromethane (20 ml), three times with DIPEA
in chloroform (10%, 20 ml) and three times with
diethyl ether to form activated CPG. The activated
CPG was dried in vacuo.

Bi-functional CPG [bf-CPG,
CPG-Sar-suc-aho-Ser(O-DMT)(N-Fmoc)] was then prepared
comprising all of the elements shown in Figure 3.

i. Activated CPG (2.0 g, loading 83
µmole/g) was placed in a filter funnel (25 ml).
Fmoc-Sar (313.5 mg, mmol) was dissolved in DMF (2.8
ml), mixed with pyBOP (525.9 mg, mmol) in DMF (2.8 ml),
and the combined, activated amino acid mixture was
added to the CPG. DIPEA (1 ml) was added to the
reaction mixture and the CPG was shaken for 1 hour at
room temperature. Chloroform (5 ml) was added and the
shaking continued for another 3 hours. The
Sar-functionalized CPG was recovered by filtration and
washed with DMF (2 x), dichloromethane (2 x), diethyl
ether (2 x) and dried in vacuo. To obtain maximal
loading, this process was repeated to give CPG-Sar
with a loading of 77 µmole/g. Any remaining free
amino groups were capped with acetic anhydride/DMAP
followed by an extensive washing and drying in vacuo
to form Sar-functionalized CPG.

ii. The Sar-functionalized CPG (2.0 g) was
treated with piperidine in DMF (2/8, 2 x 5 minutes)
and reacted with compound 2 (450 mg), pyBOP (575 mg)
and DIPEA (1000 µl) in DMF (2 x 2.8 ml). The
resulting Sar-suc-aho-Fmoc-functionalized CPG was
recovered by filtration and washed with DMF (2 x),
dichloromethane (2 x), diethyl ether (2 x) and dried in
vacuo.

iii. The Sar-suc-aho-Fmoc-functionalized CPG
was treated with piperidine in DMF (2/8, 2 x 10
minutes) and reacted with compound 3 (764 mg), pyBOP
(536 mg) and DIPEA (1000 µl) in DMF (2 x 2.8 ml) to
form a bifunctional CPG support or "bf-CPG". The
loading was determined to be 61.7 µmole/g as judged
from the Fmoc/piperidine adduct absorption at 302 nm.
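A loading figure of this kind is conventionally derived from the
absorbance of the released Fmoc/piperidine adduct by the
Beer-Lambert law. The sketch below shows the arithmetic under
stated assumptions: the extinction coefficient (7800 per M per cm,
a commonly quoted value near 301 nm), the cuvette path length and
the example absorbance, volume and resin mass are all hypothetical
inputs, since the text reports only the final loading.

```python
def fmoc_loading_umol_per_g(
    absorbance: float,
    volume_ml: float,
    resin_mg: float,
    epsilon: float = 7800.0,  # assumed extinction coefficient, 1/(M*cm)
    path_cm: float = 1.0,     # assumed cuvette path length
) -> float:
    """Support loading (umol/g) from the Fmoc/piperidine adduct absorbance.

    concentration (mol/L) = A / (epsilon * path); multiplying by the
    cleavage volume gives moles released, which is divided by the
    mass of support that was deprotected.
    """
    if min(absorbance, volume_ml, resin_mg, epsilon, path_cm) <= 0:
        raise ValueError("all inputs must be positive")
    return absorbance * volume_ml * 1e6 / (epsilon * path_cm * resin_mg)


# Hypothetical reading chosen so that the result lands near 61.7 umol/g.
example_loading = fmoc_loading_umol_per_g(
    absorbance=0.481, volume_ml=10.0, resin_mg=10.0
)
```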

In an additional embodiment, the above bf-CPG is
further functionalized by adding a photolabile
cleavage site for removal of the polypeptide from the
conjugate.

To that end, 3-nitro-4-bromomethylbenzoic acid is
prepared as described by Rich et al., J. Am. Chem.
Soc., 97:1575-1579 (1975) using a-bromo-p-toluic acid
(4-bromomethylbenzoic acid; commercially available
from Aldrich Chemical Company, Milwaukee, WI). The
3-nitro-4-bromomethylbenzoic acid moiety is added to
the above bf-CPG support by first deblocking the N-
terminal function by a treatment with piperidine in
DMF (2/8, 2 x 5 minutes), then washed thoroughly and
reacted with the 3-nitro-4-bromomethylbenzoic acid
that is activated with pyBOP to form a photocleavable
bf-CPG having the structure shown in Figure 3. The
completeness of the coupling reaction is monitored by
a positive and a negative Kaiser test after the FMOC
deprotection and the coupling reaction, respectively.

Coupling of an amino acid residue Monomer to the
photocleavable bf-CPG (also referred to as o-NB-bf-CPG)
is carried out by the esterification procedure
described by Rich et al., supra, and by Barany et al.,
J. Am. Chem. Soc., 107:4936-4942 (1985).

The resulting bf-CPG linker/support molecules
have novel and useful features in the context of the
present invention described hereinbelow.

(1) The linker/support molecule has an aqueous
ammonia sensitive cleavage site in the
sarcosine/succinyl moiety which is stable to Fmoc
peptide synthesis conditions, and which upon cleavage
releases the peptide-oligonucleotide conjugate free
from the solid CPG support, thereby allowing for the
production of a soluble phase form of the chemical
library. A soluble phase chemical library is
particularly suited for use in PCR reactions, and for
measuring peptide interactions in solution phase
binding interactions.

(2) The L-serine branch-monomer renders the
peptide-oligonucleotide conjugate isomerically pure.

(3) The photolabile linker/support molecule (o-
NB-bf-CPG) has a light sensitive (hv) cleavage site,
indicated by the arrow in Figure 3, that is sensitive
to light of about 350 nm and which upon cleavage
results in the release of soluble polypeptide.
Cleavage can be regulated by the time and intensity of
irradiation using, for example, a Rayonet RPR Reactor
as described by Barany et al., supra.

4. Coupling of Nucleotide to Solid Support 

a. Deprotect DMT 

The capped teflon solid support-coupled
linker prepared in Example 3 was admixed with 3% DCA
in DCM for 10 minutes at room temperature under inert
atmosphere to remove the DMT protecting group from
the linker and form a free hydroxyl group. The
deprotected linker/support was then washed 3 times
with DCM. The deprotected linker/support is ready for
addition of nucleotide.

b. Addition of Nucleotide to Teflon Support 
One equivalent of deprotected linker/support
was admixed with about 20 equivalents of a desired
blocked nucleotide phosphoramidite and 20 equivalents
of tetrazole (0.45M in acetonitrile) to form a coupled
nucleotide/linker/support (coupled nucleotide
complex). The coupled nucleotide complex was then
washed with an excess of acetonitrile to remove
unreacted reagents. All blocked nucleotide
phosphoramidites were obtained from Glen Research and
contain a DMT protected blocked 5'-hydroxyl, a
cyanoethyl ester (CNE) and a diisopropylamine group at
the 3'-phosphoramidite. In addition, the adenine and
cytosine derivatives contain a benzoyl group on the
base's free nitrogen and the guanosine derivative
contains an isobutyl group on the 2-amino group of the
purine base.

One equivalent of coupled nucleotide complex
was then admixed with 2 equivalents of iodine in
tetrahydrofuran/water, 9:1, for oxidation as before in
Example 3 to oxidize the coupled nucleotide complex.
Thereafter, one equivalent of oxidized
nucleotide complex was admixed with 20 equivalents of
capping reagent as before in Example 3 to cap any
unreacted free hydroxyls and to form capped solid
phase-coupled nucleotide complex.

c. Coupling of Nucleotide to Bifunctional CPG

As a demonstration of coupling chemistry for
addition of a nucleotide residue to the bifunctional
CPG support, we have synthesized and coupled an
oligodeoxyribonucleotide onto bf-CPG. To that end,
oligonucleotides were synthesized on an ABI 394 DNA
synthesizer using the standard 1 µmole scale synthesis
cycle and commercially available reagents and
phosphoramidites. The solid support used was the
bf-CPG (20 mg, 1 µmol) placed in a commercially
available empty synthesis column for the ABI
synthesizers. Repetitive yields were calculated to be
98.9%, as judged by collecting the detritylation
solution, diluting it in p-toluene sulfonic acid
monohydrate in acetonitrile (0.1 M) and measuring the
absorption at 498 nm.

The oligonucleotide was released from the support
by conc. aq. ammonia (shaking for more than 24 hours).
The oligonucleotide was isolated using commercially
available OPC-cartridges and analyzed by HPLC and PAGE
(radiolabeled with 32P-gamma-ATP and T4-kinase).

5. Coupling of Amino Acid to Solid Support
a. Deprotect Nucleotide Complex

One equivalent of capped teflon solid phase-
coupled nucleotide complex was admixed with 1
equivalent of 1,8-diazabicyclo[5,4,0]undec-7-ene (DBU;
available from Aldrich Chemical Co., Milwaukee, WI)
in DCM under inert atmosphere for 10 minutes at room
temperature to remove (deblock) the
fluorenylmethoxycarbonyl (FMOC) protecting group from
the linker in the nucleotide complex. The deprotected
nucleotide complex was then washed with excess DCM to
remove unreacted DBU and form a deprotected nucleotide
complex with a free amino group.

b. Addition of Amino Acid

One equivalent of deprotected nucleotide
complex from Example 5A was admixed with 20
equivalents of protected amino acid in
dimethylformamide (DMF) and 20 equivalents of 1-
hydroxy-benzotriazole (HOBt) under inert atmosphere
for 0.5 to 1 hour at room temperature. This reaction
condition couples the carboxy-terminus of the amino
acid via its pentafluorophenyl ester to the free amino
group of the nucleotide complex to form a
nucleotide/amino acid conjugate (the conjugate). The
conjugate was then washed in excess DCM to remove
unreacted HOBt and the precursor amino acid. The
protected amino acid is one of those described in
Example 1, having FMOC and Opfp at the amino and
carboxy terminus, and if needed, a blocking group on
the side chain as described before.



c. Coupling of Amino Acid to Bifunctional CPG

As a demonstration of coupling chemistry for
addition of an amino acid residue to the bifunctional
CPG support, we have synthesized a peptide,
H-HPQFVS-aho, and coupled the peptide to bf-CPG. To
that end, bf-CPG (428 mg, 26 µmol) was placed in a
filter funnel hooked up in an agitating device.
Reagents used were the
commercially available Fmoc amino acid derivatives
either as O-pfp esters (for Val and His [Fmoc]) or the
free acid activated with pyBOP (for Phe, Gln and Pro).
Coupling was conducted in the automated peptide
synthesizer using conventional
chemistry. The success of the coupling reaction was
monitored by a positive and a negative Kaiser test
after the Fmoc deprotection and the coupling steps,
respectively.

The peptide was deprotected and released from an
aliquot of the bf-CPG by treatment with (i) piperidine
in DMF (2/8, 2 x 5 minutes) and (ii) conc. aq. ammonia
(shaking for 24 hours). The peptide was analyzed by
reversed phase HPLC as one major peak. MS (Ion Spray,
m/e) 813 (calcd for C39H60N10O9 + H 813).

6. Elongation of the Conjugate 

The conjugate can be lengthened by alternating
cycles of addition of nucleotides and amino acids. The
following alternating cycles are repeated until the
conjugate has an amino acid polymer and an
oligonucleotide polymer of the desired length.



a. Addition of Nucleotides

To couple an additional nucleotide, the 5'-
OH on the terminal nucleotide is deprotected with DCA
following the protocol described previously in Example
4A for the deprotection of the teflon linker/support.
Thereafter, a protected nucleotide is added as
described in Example 4B.

b. Addition of Amino Acids 

To couple an additional amino acid, the
amino-FMOC terminus of the last amino acid added to
the conjugate is deprotected with DBU as described
previously in Example 5A. Thereafter, a protected
amino acid is added as described in
Example 5B.

The cycle of steps a and b above, adding alternate
nucleotides and amino acids, can be repeated until the
conjugate has polymers of the desired length and
structure.

c. Assembly of Polypeptide-Oligonucleotide CPG
Conjugates

As a demonstration of coupling chemistry for
preparation of a peptide-oligonucleotide conjugate on
a bifunctional CPG support, we have synthesized and
coupled an oligodeoxyribonucleotide onto bf-CPG. To
that end, oligonucleotides (SEQ ID NO : TCT CTC TCT
AAA CTC GGG TCT CTC (note 1); and SEQ ID NO : AGC TAG
TTC CCA AGG ATC ACC ACA CTA GCG GGG CCC TAT TCT TAG
(note 2)) were synthesized on an ABI 394 DNA
synthesizer using the standard 1 µmole scale synthesis
cycle and commercially available reagents and
phosphoramidites according to standard synthetic
chemistries. The solid support used was the
Fmoc-(Fmoc)HPQFVS(DMT)-aho-CPG (20 mg; 1 µmol)
produced above in Example that was placed in a
commercially available empty synthesis column for the
ABI synthesizers. Repetitive yields were calculated to
be 97.9% (24-mer) and 98.3% (45-mer), respectively, as
judged by collecting the detritylation solution,
diluting it in p-toluene sulfonic acid monohydrate in
acetonitrile (0.1 M) and measuring the absorption at
498 nm.

Note 1: 24-Mer test sequence.

Note 2: 45-Mer sequence. The two flanking 15-mers are
PCR primers and the middle 15-mer is the coding sequence
for the peptide using (arbitrarily) the commaless codons
CTA for F, ATC for H, ACC for P, ACA for Q and GCG for V.

The 45-mer oligonucleotide above has two flanking
15-mer regions that provide a site for PCR primers,
and a middle 15-mer that is the coding sequence for
the peptide using (arbitrarily) the commaless codons
CTA for F, ATC for H, ACC for P, ACA for Q and GCG for
V.
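The encoding scheme just described can be made concrete. The
following minimal sketch uses the five codon assignments and the
45-mer sequence given above; the function names are hypothetical,
and the comma-free test shown is one straightforward way of
checking that no codon can be read out of frame across the
junction of two adjacent codons.

```python
# Codon assignments as given in the text.
CODONS = {"F": "CTA", "H": "ATC", "P": "ACC", "Q": "ACA", "V": "GCG"}
DECODE = {codon: residue for residue, codon in CODONS.items()}


def encode(peptide: str) -> str:
    """Concatenate the unit identifier codon for each residue."""
    return "".join(CODONS[residue] for residue in peptide)


def decode(tag: str) -> str:
    """Read a coding region three nucleotides at a time."""
    if len(tag) % 3:
        raise ValueError("coding region length must be a multiple of 3")
    return "".join(DECODE[tag[i:i + 3]] for i in range(0, len(tag), 3))


def is_comma_free(codons) -> bool:
    """True if no codon appears in either shifted frame of any codon pair."""
    words = set(codons)
    for first in words:
        for second in words:
            joined = first + second
            if joined[1:4] in words or joined[2:5] in words:
                return False
    return True


# 45-mer from the text: 5' primer site, coding region, 3' primer site.
FORTY_FIVE_MER = "AGCTAGTTCCCAAGG" "ATCACCACACTAGCG" "GGGCCCTATTCTTAG"
coding_region = FORTY_FIVE_MER[15:30]
decoded_peptide = decode(coding_region)
```

Decoding the middle 15-mer returns the residues H, P, Q, F and V
in that order, matching the first five residues of the HPQFVS
peptide; the serine has no codon because it belongs to the
branch-monomer of the support.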

The peptide-oligonucleotide conjugates formed
were released from the support by conc. aq. ammonia
(shaking for more than 24 hours). The
peptide-oligonucleotide conjugates were isolated using
commercially available OPC-cartridges and analyzed by
HPLC and PAGE (radiolabeled with 32P-gamma-ATP and
T4-kinase).
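The repetitive (stepwise) yields reported for the 24-mer and the
45-mer in this example determine how much full-length
oligonucleotide can be expected. The sketch below assumes that
every nucleotide of an n-mer is added in a coupling step of equal
efficiency, so the full-length fraction is the stepwise yield
raised to the power n; that counting convention and the function
name are assumptions, not statements from the text.

```python
def full_length_fraction(stepwise_yield: float, couplings: int) -> float:
    """Expected fraction of full-length product after repeated couplings."""
    if not 0.0 < stepwise_yield <= 1.0:
        raise ValueError("stepwise_yield must be in (0, 1]")
    if couplings < 0:
        raise ValueError("couplings must be non-negative")
    return stepwise_yield ** couplings


# Stepwise yields quoted in the text for the two test oligonucleotides.
fraction_24mer = full_length_fraction(0.979, 24)
fraction_45mer = full_length_fraction(0.983, 45)
```

Under this assumption roughly 60% of the 24-mer chains and
roughly 46% of the 45-mer chains would be full length, which
illustrates why small differences in repetitive yield matter for
longer identifier sequences.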

d. Synthesis of a Peptide-Oligonucleotide
Conjugate Library on a bf-CPG Support

Using a bifunctional (bf) CPG support as
described in Example 3B, a peptide-oligonucleotide
conjugate library is constructed as described below in
the following steps:

Step 1. The solid support, bf-CPG, is placed in
a commercially available empty synthesis column for
use in an ABI 394 DNA synthesizer (Applied Biosystems,
Inc., Foster City, CA). The 3'-oligonucleotide
primer sequence (SEQ ID NO ) is synthesized and
then chemically attached to the bf-CPG support
according to standard manufacturer's procedures using
the standard synthesis cycles and commercially
available reagents and phosphoramidites.

Step 2. The support after Step 1 is divided into
a number of aliquots representing the number of
different monomer amino acids to be added at library
position 1 of the polypeptides to be synthesized.
Each Fmoc-amino acid residue Monomer(1) is coupled
according to a protocol consistent with the protection
group on the monomer. In a typical protocol bf-CPG is
placed in a filter funnel hooked up in an agitating
device. The reagents used are the commercially
available Fmoc amino acid derivatives either as O-pfp
esters or the free acid activated with pyBOP. The
functionalized bf-CPG is then treated with piperidine
in DMF (2/8, 2 x 5 minutes) and reacted with the
monomer, activated as described above. The success of
the coupling reaction is monitored by a positive and a
negative Kaiser test after the Fmoc deprotection and
coupling steps, respectively. Likewise, the absorption
at 302 nm from the deprotected Fmoc/piperidine
conjugate acts as an indicator that the coupling
reaction proceeds successfully. The Monomer(1)-
functionalized bf-CPG is recovered by filtration and
washed with DMF (2 x), dichloromethane (2 x), diethyl
ether (2 x) and dried in vacuo to form Monomer(1)-
conjugated bf-CPG.

Step 3. The oligonucleotide sequence that is the
unit identifier for the appropriate Monomer(1) is
attached to each of the Monomer(1)-functionalized
bf-CPG's by placing each of the different
Monomer(1)-functionalized bf-CPG's in separate
commercially available empty synthesis columns for the
DNA synthesizers. The encoding oligonucleotide
sequence [e.g., a commaless trinucleotide unit
designated Oligo(1)] that corresponds to the species
of Monomer(1) present on that
Monomer(1)-functionalized bf-CPG is chemically
attached to the Monomer(1)-functionalized bf-CPG by 3
consecutive synthesis cycles on the DNA synthesizer as
described previously using the standard synthesis
cycles and commercially available reagents and
phosphoramidites, to form a Monomer(1),Oligo(1)-bf-
CPG conjugate.

Step 4. The different Monomer(1)-functionalized
bf-CPG's are pooled, mixed and divided into a number
of aliquots representing the number of different amino
acid residue monomers to be added in the library at
position 2. Monomer(2) is attached to the CPG as
described under Step 2 for Monomer(1).

Step 5. The oligonucleotide sequence that is the
unit identifier for the appropriate Monomer(2) is
attached to each of the Monomer(2)-functionalized
bf-CPG's by placing each of the different
Monomer(2)-functionalized bf-CPG's in separate
commercially available empty synthesis columns for the
DNA synthesizers.

These steps, Steps 4-5, are repeated until the
appropriate number of monomers and corresponding unit
identifier oligonucleotides have been incorporated.

Step 6. The combined solid supports are placed
in a commercially available empty synthesis column for
the DNA synthesizers. The 5'-oligonucleotide primer
sequence is chemically attached to the
Monomer(n)-Monomer(1)-functionalized bf-CPG as
described previously using the standard synthesis
cycles and commercially available reagents and
phosphoramidites. The final DMT group is left
attached to the oligonucleotide-peptide conjugates to
be used as an affinity tag in the purification step.

Step 7. The combinatorial
peptide-oligonucleotide library is deprotected and
released from the support by treatment first with TBAF
for the appropriate time, and then with conc. aq.
ammonia (shaking for more than 24 hours). The
peptide-oligonucleotide conjugates are isolated and
purified using commercially available OPC-cartridges.
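The divide, couple, tag and pool procedure of Steps 2 through 5
can be simulated to show how each library member ends up carrying
an identifier that records its own synthesis history. The sketch
below is an idealized model under stated assumptions: it tracks
distinct chemical species rather than individual support
particles, it records monomers and codons in order of addition
(not in 5' to 3' or N to C orientation), it omits the primer
sequences, and it borrows the five codon assignments from the
preceding example purely for illustration.

```python
# Illustrative monomer set and unit identifiers (borrowed assumption).
CODONS = {"F": "CTA", "H": "ATC", "P": "ACC", "Q": "ACA", "V": "GCG"}
DECODE = {codon: residue for residue, codon in CODONS.items()}


def split_and_pool(codons: dict, positions: int) -> list:
    """Return (peptide, tag) pairs after the given number of cycles.

    Each cycle divides the pooled support into one aliquot per monomer,
    couples that monomer, attaches its identifier codon, and re-pools.
    """
    pool = [("", "")]
    for _ in range(positions):
        next_pool = []
        for monomer, codon in codons.items():  # one aliquot per monomer
            for peptide, tag in pool:
                next_pool.append((peptide + monomer, tag + codon))
        pool = next_pool  # aliquots are pooled and mixed
    return pool


def decode(tag: str) -> str:
    """Recover the monomer history from an identifier sequence."""
    return "".join(DECODE[tag[i:i + 3]] for i in range(0, len(tag), 3))


library = split_and_pool(CODONS, positions=3)
library_size = len(library)
```

With five monomers and three positions the model yields 125
distinct conjugates, and decoding any identifier returns the
monomer sequence of the polymer it is attached to.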

7. Removal of Protecting Groups

After complete synthesis of one or more
bifunctional molecules, the protecting groups are
removed from the terminal nucleotide, from the
terminal amino acid, and from the side chains of
protected amino acids.

k 

a. Removal of Nucleotide Protecting Group

The DMT protecting group on the 5'-OH of the
last nucleotide of the oligonucleotide polymer is
removed with DCA following the protocol described
previously in Example 4A.

b. Removal of Amino Acid Protecting Group

The FMOC protecting group on the amino-terminus
of the last amino acid of the amino acid
polymer is removed with DBU as described previously in
Example 5A.



c. Removal of Amino Acid Side Chain Protecting
Group

Conditions for removal of an amino acid side
chain protecting group depend on the particular
protecting group as follows:

i. Removal of TBS and TSE ester Groups

One unit of the conjugate is admixed
with about 20 equivalents of tetrabutylammonium
fluoride (TBAF) in DCM and maintained at room
temperature under inert atmosphere overnight to remove
the TBS or TMSE ethers protecting the side chains of
tyrosine, aspartic acid, glutamic acid, serine, and
threonine.

ii. Removal of the Bz Group

The conjugate is admixed with an excess
of aqueous ammonia and maintained at 60 °C overnight
under inert atmosphere to remove the benzyl (Bz) group
protecting the side chain amino group of lysine.



iii. Removal of the MTr, Bum and Tpm Groups

The conjugate is admixed with 20 to 50
percent TFA and maintained at room temperature for
about 5 minutes under inert atmosphere to remove the
MTr, Bum or Tpm groups protecting the side chains of
arginine, histidine or cysteine, respectively.
Thereafter, the conjugate is neutralized with
triethanolamine and CHCl3.

iv. Removal of the Formyl Group

The conjugate is admixed with aqueous
buffer at pH 12 and maintained at room temperature for
about 5 minutes under inert atmosphere to remove the
formyl group protecting the reactive 2-amino group of
tryptophan.

8. Cleavage of Conjugate from Solid Support

a. Cleavage of the Teflon Support

After the protecting groups are removed from
the conjugate on the teflon support, the bifunctional
molecule is removed from the solid support by admixing
the conjugate with a cleaving solution of 100 mM
sodium periodate, 100 mM sodium phosphate buffer, pH
7.2, in acetonitrile/water (1:4 v/v). The admixture
is maintained with agitation at room temperature with
exclusion of light. After 4 hours of agitation, the
liquid phase is removed and the solid support is washed
with excess water and methanol. The wash solutions
are then removed and 1 umole of solid support is
admixed with 50 ul n-propylamine, 100 ul acetonitrile
and 400 ul water and maintained at 55 °C for 3 hours.
Thereafter, the liquid phase is recovered, evaporated
to dryness in vacuo, and the dried product is
dissolved in acetonitrile/water. The dissolved
product is purified using reverse phase HPLC on an EM
LiChrospher 100RP-18m 50um (4x25) HPLC column.
The mobile phase A is 95 % 0.1 TEAA buffer (pH 7.0)
and 5 % acetonitrile, and mobile phase B is 5 % TEAA
buffer (pH 7.0) and 95 % acetonitrile. The gradient
is 100 % A for 5 min, 100 % A to 50 % for 50 min, with
a flow rate of 1 mL per min. The homogeneous fraction
is collected to yield a solution of pure bifunctional
molecule.
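The gradient stated above can be written out as a simple
function of run time. The following is a minimal sketch only,
assuming that the "50 %" end point refers to mobile phase A,
that the change from 100 % A to 50 % A is linear over the
50 minute segment, and that the composition is held at the
end point afterwards. The names HOLD_MIN, RAMP_MIN, percent_a
and percent_b are illustrative and do not appear in the
procedure.

```python
# Sketch of the HPLC gradient: 100 % A for 5 min, then 100 % A to 50 % A
# over 50 min. Linearity and the hold after the ramp are assumptions.
HOLD_MIN = 5.0    # initial isocratic hold at 100 % A (minutes)
RAMP_MIN = 50.0   # duration of the gradient segment (minutes)
START_A = 100.0   # percent mobile phase A at the start of the ramp
END_A = 50.0      # percent mobile phase A at the end of the ramp (assumed to mean A)


def percent_a(t_min: float) -> float:
    """Return the percentage of mobile phase A at run time t_min (minutes)."""
    if t_min < 0:
        raise ValueError("run time must be non-negative")
    if t_min <= HOLD_MIN:
        return START_A
    if t_min >= HOLD_MIN + RAMP_MIN:
        return END_A  # assumption: composition is held after the ramp ends
    fraction = (t_min - HOLD_MIN) / RAMP_MIN
    return START_A + fraction * (END_A - START_A)


def percent_b(t_min: float) -> float:
    """Return the percentage of mobile phase B, the remainder of the eluent."""
    return 100.0 - percent_a(t_min)


if __name__ == "__main__":
    for t in (0, 5, 30, 55):
        print(f"t = {t:>2} min: {percent_a(t):.1f} % A, {percent_b(t):.1f} % B")
```

At a flow rate of 1 mL per min, the run time in minutes also
equals the cumulative eluent volume in mL.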

The solution is dialyzed as needed to change the
buffer of the purified material.

b. Cleavage of the bf-CPG Support

The peptide-oligonucleotide conjugate
coupled to the bf-CPG support can be cleaved in two
different locations.

As shown in Figure 3, an aqueous ammonia reaction
cleaves the sarcosine-succinyl linker moiety, thereby
releasing the peptide-oligonucleotide conjugate from
the solid support, forming a solution phase conjugate.

As also shown in Figure 3, irradiation with light
at about 350 nm will cleave the photoreactive site
present on the serine branch-monomer moiety, thereby
releasing the polypeptide from the solid support.

In both cases, the solid support can readily be
separated from the released, soluble material by
filtration to form isolated soluble conjugate or
polypeptide, depending upon the cleavage reaction.



9. Preparation of a Library of Bifunctional
Molecules

The synthetic procedures of Examples 1-8 detail
the methods for producing a bifunctional molecule.
To form a library of molecules, additional
manipulations are required. First, the synthesis is
conducted including the steps of aliquoting, adding
different units to each aliquot, and pooling the
aliquots to sequentially build the library. Second,
if desired, the PCR primer binding sites and the unit
identifier oligonucleotides can be added as
presynthesized blocks rather than added nucleotide by
nucleotide.

a. Synthesis of Protected Oligonucleotides

Using this procedure, PCR primer
binding site oligonucleotides P1 and P2 were
synthesized having the nucleotide sequences shown in
Table 2, but having a DMT at the oligonucleotide's 5'
terminus, and having a CNE ester and an amino
diisopropyl phosphoramidate at the oligonucleotide's
3' terminus. Similarly, unit identifier
oligonucleotides were synthesized for glycine (Gly)
and methionine (Met) having 6 nucleotides per unit and
having the blocked termini described above. The unit
identifier oligonucleotide sequences are shown in
Table 2.

TABLE 2

Designation    Oligonucleotide Sequence

P1             5'-GGGCCCTATTCTTAG-3'
P2             5'-AGCTACTTCCCAAGG-3'
Zgly           5'-CTCATG-3'
Zmet           5'-ACGGTA-3'
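The sequences of Table 2 are sufficient to read an identifier
oligonucleotide back into an amino acid sequence. The
following is a minimal sketch, assuming the orientation used
in the structures of this Example, namely that the identifier
is written 5' to 3' as P2, then the unit identifiers, then P1,
and that the unit identifier adjacent to P1 corresponds to the
residue nearest the linker. The function name
decode_identifier and the dictionary UNIT_IDENTIFIERS are
illustrative only.

```python
# Sketch: decode an identifier oligonucleotide using the Table 2 sequences.
# Orientation (P2 + unit identifiers + P1, read 5' to 3') is an assumption
# taken from how the structures in this Example are written.
P1 = "GGGCCCTATTCTTAG"
P2 = "AGCTACTTCCCAAGG"
UNIT_IDENTIFIERS = {"CTCATG": "gly", "ACGGTA": "met"}
UNIT_LENGTH = 6  # nucleotides per unit identifier


def decode_identifier(oligo: str) -> list[str]:
    """Return the residues, linker-proximal first, encoded by the identifier."""
    if not (oligo.startswith(P2) and oligo.endswith(P1)):
        raise ValueError("identifier must start with P2 and end with P1")
    coding = oligo[len(P2):len(oligo) - len(P1)]
    if len(coding) % UNIT_LENGTH != 0:
        raise ValueError("coding region is not a whole number of units")
    units = [coding[i:i + UNIT_LENGTH]
             for i in range(0, len(coding), UNIT_LENGTH)]
    unknown = [u for u in units if u not in UNIT_IDENTIFIERS]
    if unknown:
        raise ValueError(f"unrecognized unit identifiers: {unknown}")
    # The unit next to P1 was added first, so read the units right to left.
    return [UNIT_IDENTIFIERS[u] for u in reversed(units)]


if __name__ == "__main__":
    example = P2 + "CTCATGCTCATGACGGTA" + P1
    print(".".join(decode_identifier(example)))
```

With only two entries the lookup is trivial, but the same
structure would apply to a larger alphabet provided each unit
identifier has the same length.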



b. Synthesis of a Library

The synthesis of a prototype library is
described where the chemical unit is an amino acid,
the alphabet size is 2, being comprised of glycine and
methionine, the unit identifier nucleotide sequence is
6 nucleotides in length, and the chemical polymer
is three amino acids in length. A schematic of
the process is shown in Figure 2.

The solid support prepared in Example 2 is
coupled to the linker as described in Example 3. For
convenience, the solid support-coupled linker molecule
is referred to as LINK. Thereafter, protected
oligonucleotide P1 is coupled to LINK as described for
a single protected nucleotide in Example 4 to form the
structure P1-LINK.

In Step 1, P1-LINK is divided into two aliquots.
The first aliquot is subjected to the sequential
coupling of the amino acid residue glycine as
described in Example 5, and then coupling of the
protected oligonucleotide Zgly as described in Example
4 to form the structure CTCATG-P1-LINK-gly. The
second aliquot is similarly coupled to add the amino
acid methionine and the oligonucleotide Zmet to form
the structure ACGGTA-P1-LINK-met. The two aliquots
are then pooled to form a mixture of the two
bifunctional molecules.

In Step 2, the pool from Step 1 is divided into
two aliquots. The first aliquot is subjected to a
sequential coupling as before, adding glycine and the
oligonucleotide Zgly to form the structures:

CTCATGCTCATG-P1-LINK-gly.gly, and
CTCATGACGGTA-P1-LINK-met.gly.

The second aliquot is subjected to a sequential
coupling as before, adding methionine and the
oligonucleotide Zmet to form the structures:

ACGGTACTCATG-P1-LINK-gly.met, and
ACGGTAACGGTA-P1-LINK-met.met.

The two aliquots are then pooled to form a mixture of
the four bifunctional molecules.
In Step 3, the pool from Step 2 is divided into
two aliquots. The first aliquot is subjected to a
sequential coupling as before, adding glycine and the
oligonucleotide Zgly. Thereafter, protected
oligonucleotide P2 is coupled to the growing
bifunctional molecules in the pool as described for a
single protected nucleotide in Example 4 to form the
structures:

P2CTCATGCTCATGCTCATGP1-LINK-gly.gly.gly,
P2CTCATGCTCATGACGGTAP1-LINK-met.gly.gly,
P2CTCATGACGGTACTCATGP1-LINK-gly.met.gly, and
P2CTCATGACGGTAACGGTAP1-LINK-met.met.gly.

The second aliquot is subjected to a sequential
coupling as before, adding methionine and the
oligonucleotide Zmet. Thereafter, protected
oligonucleotide P2 is coupled to the growing
bifunctional molecules in the pool as described for a
single protected nucleotide in Example 4 to form the
structures:

P2ACGGTACTCATGCTCATGP1-LINK-gly.gly.met,
P2ACGGTACTCATGACGGTAP1-LINK-met.gly.met,
P2ACGGTAACGGTACTCATGP1-LINK-gly.met.met, and
P2ACGGTAACGGTAACGGTAP1-LINK-met.met.met.

The two aliquots are then pooled to form a mixture of
the eight bifunctional molecules.
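The divide, couple and pool cycle of Steps 1 to 3 can be
simulated to confirm which species are expected. The
following is a minimal sketch, assuming that each new unit
identifier is added at the P2-facing (5') end of the growing
oligonucleotide while each new residue is added at the end of
the peptide distal to the linker, as the structures above are
written. The names split_and_pool and format_molecule are
illustrative only.

```python
# Sketch: simulate split-and-pool synthesis of the prototype library.
# Growth directions are assumptions read from the structures in the text.
ALPHABET = {"gly": "CTCATG", "met": "ACGGTA"}  # residue -> unit identifier


def split_and_pool(alphabet: dict[str, str], rounds: int):
    """Return (coding region, peptide) pairs after the given number of rounds."""
    pool = [("", ())]  # start from P1-LINK: no units coupled yet
    for _ in range(rounds):
        next_pool = []
        # One aliquot per chemical unit; every species in the pool is
        # present in every aliquot.
        for residue, unit_id in alphabet.items():
            for coding, peptide in pool:
                next_pool.append((unit_id + coding, peptide + (residue,)))
        pool = next_pool  # pooling the aliquots
    return pool


def format_molecule(coding: str, peptide: tuple[str, ...]) -> str:
    """Write a finished species in the notation used for Step 3."""
    return f"P2{coding}P1-LINK-{'.'.join(peptide)}"


if __name__ == "__main__":
    library = split_and_pool(ALPHABET, rounds=3)
    for coding, peptide in library:
        print(format_molecule(coding, peptide))
    print(len(library), "species")
```

The number of species equals the alphabet size raised to the
number of rounds, here 2 to the power 3, which is why a larger
alphabet requires more aliquots per step.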

The resulting pool of eight different
bifunctional molecules represents a small library
produced according to the methods of this invention.
By increasing the alphabet size one increases the
number of aliquots per step.

The foregoing specification, including the
specific embodiments and examples, is intended to be
illustrative of the present invention and is not to be
taken as limiting. Numerous other variations and
modifications can be effected without departing from
the true spirit and scope of the present invention.

What Is Claimed Is:

1. A bifunctional molecule according to the
formula A-B-C, wherein A is a chemical moiety, B is a
linker molecule operatively linked to A and C, and C
is an identifier oligonucleotide comprising a sequence
of nucleotides which sequence identifies the structure
of chemical moiety A.

2. The bifunctional molecule of claim 1 wherein
A is a polymer comprising a linear series of chemical
units represented by the formula (Xn)a, wherein X is a
single chemical unit in polymer A; and
identifier oligonucleotide C is represented by the
formula (Zn)a, wherein Z is a unit identifier
nucleotide sequence within oligonucleotide C that
identifies the chemical unit X at position n; and
wherein
n is a position identifier for both X in polymer
A and Z in oligonucleotide C having the value of 1+i,
where i is an integer from 0 to 10, such that when n
is 1, X or Z is located most proximal to the linker,
and a is an integer from 4 to 50.

3. The bifunctional molecule of claim 2 wherein
said unit identifier nucleotide sequence Z has a
length of from 2 to 8 nucleotides.

4. The bifunctional molecule of claim 1 wherein
said chemical moiety A is an oligosaccharide,
polypeptide, glycolipid, lipid, proteoglycan,
glycopeptide or oligonucleotide.

5. The bifunctional molecule of claim 2 wherein
said polymer A is a polypeptide, X is an amino acid
residue in said polypeptide, and unit identifier
nucleotide sequence Z is a hexanucleotide sequence
that identifies the amino acid residue at position n
in polypeptide A.

6. The bifunctional molecule of claim 5 wherein
said amino acid residue is selected from the group
consisting of natural, modified and non-natural amino
acids.

7. The bifunctional molecule of claim 2 wherein
said identifier oligonucleotide C has a nucleotide
sequence according to the formula P1-(Zn)a-P2, where
P1 and P2 are nucleotide sequences that provide PCR
primer binding sites adapted to amplify the polymer
identifier oligonucleotide.

8. The bifunctional molecule of claim 7 wherein
said P1 and P2 each contain a sequence that defines a
restriction endonuclease site when present in a
PCR-amplified duplex DNA fragment.

9. The bifunctional molecule of claim 8 wherein
said restriction sites are located proximal to (Zn)a
relative to the PCR primer binding sites.

10. The bifunctional molecule of claim 9 wherein
said restriction endonuclease sites form
non-overlapping cohesive termini upon restriction
endonuclease cleavage.

11. A library comprising a plurality of species
of bifunctional molecules according to claim 1.

12. The library of claim 11 wherein said
plurality of species is defined by the formula V^a,
where V represents the number of different chemical
units forming an alphabet of possible chemical units
of X, and a is an exponent to V and represents the
number of chemical units of X forming polymer A.

13. The library of claim 12 wherein X is a
natural amino acid and V is 20.

14. The library of claim 12 wherein the number
of chemical units (a) forming polymer A can be from 3
to 8.

15. The library of claim 12 wherein X is an
amino acid and a is 6.

16. The library of claim 12 wherein X is an
amino acid and said unit identifier nucleotide
sequence Z has a length from 3 to 6 nucleotides.

17. The library of claim 11 wherein each of said
species of bifunctional molecules in said plurality is
present in molar equivalents of from 0.2 to 10.0.

18. The library of claim 12 wherein said
identifier oligonucleotide C in each of said species
of bifunctional molecules has a nucleotide sequence
according to the formula P1-(Zn)a-P2, where P1 and P2
are nucleotide sequences that provide PCR primer
binding sites adapted to amplify the identifier
oligonucleotide, and where the nucleotide sequences of
P1 and P2 are shared by all bifunctional molecule
species in the library.

19. A method for identifying a chemical
structure that participates in a preselected binding
interaction with a biologically active molecule, said
chemical structure being present in a library of
bifunctional molecules according to claim 11,
comprising the steps of:

a) admixing in solution said library of
bifunctional molecules with the biologically active
molecule under binding conditions for a time period
sufficient to form a binding reaction complex;

b) isolating the complex formed in step
(a); and

c) determining the nucleotide sequence of
the identifier oligonucleotide in the isolated complex
and thereby identifying the chemical structure that
participated in the preselected binding interaction.

20. The method of claim 19 wherein said
biologically active molecule is affixed to a solid
support.

21. The method of claim 19 wherein said
biologically active molecule is operatively linked to
a binding means capable of binding a binder molecule.

22. The method of claim 21 wherein said binding
means is selected from the group consisting of biotin,
protein A and magnetic beads.

23. The method of claim 19 wherein said
determining comprises the steps of:

i) forming a polymerase chain reaction
(PCR) amplification product from the sequence of the
isolated identifier oligonucleotide; and

ii) determining the sequence of the PCR
amplification product, thereby determining the
sequence of the identifier oligonucleotide.

24. A method for preparing a library comprising
a plurality of bifunctional molecules according to
claim 11 comprising the steps of:

a) providing a linker molecule B having
termini A' and C' according to the formula A'-B-C'
that is adapted for reaction with a chemical precursor
unit X' at termini A' and with a nucleotide precursor
Z' at termini C';

b) conducting syntheses by adding chemical
precursor unit X' to termini A' of said linker and
adding precursor unit identifier oligonucleotide Z' to
termini C' of said linker, to form a composition
containing bifunctional molecules having the structure
Xn-B-Zn;

c) repeating step (b) on one or more
aliquots of the composition to produce aliquots that
contain a product containing a bifunctional molecule;

d) combining the aliquots produced in step
(c) to form an admixture of bifunctional molecules,
thereby forming said library.

25. The method of claim 24 wherein said steps
(c) and (d) are repeated on the admixture of step (d)
to add an additional chemical unit X and corresponding
identifier oligonucleotide Z to the bifunctional
molecules in the admixture.

26. The method of claim 25 wherein said
repetition of steps (c) and (d) are repeated on the
admixture from 1 to 6 times, thereby forming a polymer
A on said bifunctional molecules such that a is from 3
to 10.

27. The method of claim 19 wherein said linker
molecule is a bifunctional solid support selected from
the group consisting of bf-CPG or o-NB-bf-CPG.

28. The method of claim 24 wherein said linker
molecule is a bifunctional solid support selected from
the group consisting of bf-CPG or o-NB-bf-CPG.

29. A bifunctional solid support for
synthesizing an oligopeptide/oligonucleotide
conjugate comprising:

a solid support, said solid support being of
a type which is dispersible in aqueous solution,

a first linkage unit coupled to said solid
support,

a second linkage unit coupled to said first
linkage unit, and

a bifunctional unit coupled to said second
linkage unit,

said bifunctional unit having a first
leaving group employable for oligopeptide synthesis
and a second leaving group employable for
oligonucleotide synthesis,

said first leaving group being N-FMOC or its
functional equivalent,

said second leaving group being O-DMT or its
functional equivalent,

said second linkage unit being coupled to
said first linkage unit by means of a bond cleavable
by exposure to concentrated aqueous ammonia,

said solid support, said first linkage unit,
said second linkage unit, said cleavable bond, and
said bifunctional unit, exclusive of said first and
second leaving groups, each being substantially
chemically unreactive to conditions employed by
oligopeptide synthetic protocols using FMOC leaving
groups and conditions employed by oligonucleotide
synthetic protocols using O-DMT leaving groups.

30. A bifunctional solid support as described in
claim 29 wherein said solid support is a controlled
pore glass.

31. A bifunctional solid support as described in
claim 29 wherein said bond between said first and said
second linkage units is an alkyl ester.

32. A bifunctional solid support as described in
claim 29 wherein said bifunctional unit being a serine
residue having an amino end, a carboxyl end, and a
hydroxyl end, said serine being coupled at its
carboxyl end to said second linkage unit, being
coupled at its amino end to said first leaving group,
and being coupled at its hydroxyl end to said second
leaving group.

33. A bifunctional solid support as described in
claim 29 wherein:

said solid support is controlled pore glass,

said bond between said first and said second
linkage units is an alkyl ester, and

said bifunctional unit being a serine
residue having an amino end, a carboxyl end, and a
hydroxyl end, said serine being coupled at its
carboxyl end to said second linkage unit, being
coupled at its amino end to said first leaving group,
and being coupled at its hydroxyl end to said second
leaving group.

34. A bifunctional solid support as described in
claim 33 wherein:

said solid support is aminopropyl-CPG,

said first linkage unit includes a sarcosine
linker coupled to the aminopropyl-CPG and a succinyl
linker coupled to the sarcosine linker by means of an
amide bond,

said second linkage unit includes an
aminohexanol group coupled to said succinyl linker by
means of an alkyl ester,

said bifunctional unit includes an L-serine
residue, the amino end of the serine being coupled by
means of an amide bond to said aminohexanol linker,
the carboxyl end of said serine being coupled to an
FMOC leaving group, the hydroxyl end of said serine
being coupled to an O-DMT leaving group.

35. A bifunctional solid support as described in
claim 29 further comprising:

a third linkage unit interposed between and
coupled to said bifunctional unit and said first
leaving group,

said third linkage unit being cleavable by
exposure to ultraviolet light.

36. A bifunctional solid support as described in
claim 35 wherein said third linkage unit includes a
3-nitro-4-0-ethyl benzoate group coupled by means of an
amide bond to the amino end of said serine and coupled
by means of an ester bond to an FMOC blocked amino
acid.

37. A bifunctional solid support as described in
claim 36 wherein:

said solid support is aminopropyl-CPG,

said first linkage unit includes a sarcosine
linker coupled to the aminopropyl-CPG and a succinyl
linker coupled to the sarcosine linker by means of an
amide bond,

said second linkage unit includes an
aminohexanol group coupled to said succinyl linker by
means of an alkyl ester, and

said bifunctional unit includes an L-serine
residue, the amino end of the serine being coupled by
means of an amide bond to said aminohexanol linker,
the carboxyl end of said serine being coupled to an
FMOC leaving group, the hydroxyl end of said serine
being coupled to an O-DMT leaving group.

38. An element of a library of
oligopeptide/oligonucleotide conjugates comprising:

a solid support, said solid support being of
a type which is dispersible in aqueous solution,

a first linkage unit coupled to said solid
support,

a second linkage unit coupled to said first
linkage unit,

a bifunctional unit coupled to said second
linkage unit,

an oligopeptide attached to said
bifunctional unit, and

an oligonucleotide attached to said
bifunctional unit.

39. An element of a library of
oligopeptide/oligonucleotide conjugates as described
in claim 38 further comprising a cleavable bond for
coupling said first linkage unit to said second
linkage unit, said cleavable bond being cleavable by
exposure to concentrated aqueous ammonia.

40. An element of a library of
oligopeptide/oligonucleotide conjugates as described
in claim 39 further comprising a cleavable bond for
coupling said bifunctional unit to said oligopeptide,
said cleavable bond being cleavable by exposure to
ultraviolet light.

41. An element of a library of
oligopeptide/oligonucleotide conjugates comprising:

a bifunctional unit,

an oligopeptide attached to said
bifunctional unit, and

an oligonucleotide attached to said
bifunctional unit.

42. An element of a library of
oligopeptide/oligonucleotide conjugates as described
in claim 41 further comprising a cleavable bond for
coupling said bifunctional unit to said oligopeptide,
said cleavable bond being cleavable by exposure to
ultraviolet light.

P1-LINK

   | Step 1

CTCATG-P1-LINK-gly
ACGGTA-P1-LINK-met

   | Step 2

CTCATGCTCATG-P1-LINK-gly.gly
CTCATGACGGTA-P1-LINK-met.gly

ACGGTACTCATG-P1-LINK-gly.met
ACGGTAACGGTA-P1-LINK-met.met

   | Step 3

P2CTCATGCTCATGCTCATGP1-LINK-gly.gly.gly
P2CTCATGCTCATGACGGTAP1-LINK-met.gly.gly
P2CTCATGACGGTACTCATGP1-LINK-gly.met.gly
P2CTCATGACGGTAACGGTAP1-LINK-met.met.gly

P2ACGGTACTCATGCTCATGP1-LINK-gly.gly.met
P2ACGGTACTCATGACGGTAP1-LINK-met.gly.met
P2ACGGTAACGGTACTCATGP1-LINK-gly.met.met
P2ACGGTAACGGTAACGGTAP1-LINK-met.met.met

P1 = GGGCCCTATTCTTAG
P2 = AGCTACTTCCCAAGG

FIG. 2
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